Material recognition using Triboelectric Nanogenerator-based sensing and machine learning
Experimental Setup
Development of a capable machine learning model
The goal of the project is to determine whether the voltage and current output of the triboelectric nanogenerator (TENG) can be used to recognize which material it is being pushed against.
Since the data is sequential, an LSTM was chosen as the model. The implementation uses Python and PyTorch.
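As a rough illustration of this kind of model, the following is a minimal sketch of an LSTM classifier in PyTorch. It is not the project's actual code: the class name LSTMClassifier, the parameter names and the number of classes are assumptions made for the example. The final hidden states of the last layer (both directions, if bidirectional) are fed into a linear layer that produces one score per material.

```python
import torch
from torch import nn


class LSTMClassifier(nn.Module):
    """Hypothetical LSTM classifier; names and layout are assumptions."""

    def __init__(self, n_features, hidden_size, num_layers, n_classes, bidirectional):
        super().__init__()
        self.bidirectional = bidirectional
        self.lstm = nn.LSTM(
            input_size=n_features,
            hidden_size=hidden_size,
            num_layers=num_layers,
            bidirectional=bidirectional,
            batch_first=True,  # input shape: (batch, sequence, features)
        )
        directions = 2 if bidirectional else 1
        self.head = nn.Linear(hidden_size * directions, n_classes)

    def forward(self, x):
        # h_n has shape (num_layers * directions, batch, hidden_size)
        _, (h_n, _) = self.lstm(x)
        if self.bidirectional:
            # Last layer: final forward state and final backward state
            last = torch.cat([h_n[-2], h_n[-1]], dim=1)
        else:
            last = h_n[-1]
        return self.head(last)  # unnormalized class scores (logits)


# Example: batch of 8 voltage sequences, 100 points each, 1 feature.
# n_classes=5 is a placeholder, not the real number of materials.
model = LSTMClassifier(n_features=1, hidden_size=10, num_layers=4,
                       n_classes=5, bidirectional=True)
logits = model(torch.randn(8, 100, 1))
loss = nn.CrossEntropyLoss()(logits, torch.zeros(8, dtype=torch.long))
```

The logits can be passed directly to CrossEntropyLoss, which applies the softmax internally.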
Data preparation
- Resample/interpolate the data, so that all voltage and current readings have the same time interval between them
- Split each recording into pieces of the same length of n points (using DataSplitter(n))
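The two preparation steps can be sketched in plain Python as follows. This is only an illustration under assumptions: the function names are hypothetical stand-ins (split_fixed_length is not the project's DataSplitter), linear interpolation is assumed for the resampling, and leftover points that do not fill a whole piece are assumed to be dropped.

```python
from bisect import bisect_right


def resample_uniform(times, values, dt):
    """Linearly interpolate (times, values) onto a grid with spacing dt.

    `times` must be strictly increasing and contain at least two points.
    """
    n_steps = int((times[-1] - times[0]) / dt + 1e-9) + 1
    resampled = []
    for k in range(n_steps):
        t = times[0] + k * dt
        # Index of the sample at or just before t, clamped to a valid segment
        i = max(min(bisect_right(times, t) - 1, len(times) - 2), 0)
        frac = (t - times[i]) / (times[i + 1] - times[i])
        resampled.append(values[i] + frac * (values[i + 1] - values[i]))
    return resampled


def split_fixed_length(values, n):
    """Cut a sequence into consecutive pieces of exactly n points.

    Assumption: a trailing remainder shorter than n is discarded.
    """
    return [values[i:i + n] for i in range(0, len(values) - n + 1, n)]


# Irregularly sampled readings -> uniform 1.0 s grid -> pieces of 2 points
uniform = resample_uniform([0.0, 1.0, 3.0], [0.0, 2.0, 6.0], dt=1.0)
pieces = split_fixed_length(uniform, 2)
```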
Phase 1
For the initial model, 32 different models/training settings were tested with the data.
Common Settings
- number of features: 1 (only voltage data)
- train/test data split ratio: 0.7
- common random state/seed 42 for all shufflers (to get the same training and validation data every time)
- number of epochs: 60
- loss function: CrossEntropyLoss
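To illustrate why a fixed seed gives the same training and validation data on every run, here is a minimal sketch of a seeded split. The function name is hypothetical, and it assumes that the ratio 0.7 means 70% of the pieces go into the training set.

```python
import random


def train_test_split(samples, train_ratio=0.7, seed=42):
    """Shuffle with a fixed seed, then split; the same seed gives the same split."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)  # private RNG, global state untouched
    cut = int(len(shuffled) * train_ratio)
    return shuffled[:cut], shuffled[cut:]


train, test = train_test_split(range(10))
```

Calling the function again with the same seed returns exactly the same two lists, which keeps the different models comparable.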
Varied Settings
- num_layers: 2 (331 points), 4 (165 points)
- hidden_size: 5 (251 points), 10 (245 points)
- bidirectional: True (299 points), False (197 points)
- scheduler: ExponentialLR(gamma=0.95, optimizer=Adam (initial_lr: 0.1, ...), ...) (307 points), ExponentialLR(gamma=0.95, optimizer=Adam (initial_lr: 0.25, ...), ...) (189 points)
- splitter: DataSplitter(100) (266 points), DataSplitter(300) (230 points)
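Five settings with two values each give 2^5 = 32 combinations, which matches the number of models tested. Assuming the 32 runs were the full grid of these settings, the configurations could be enumerated as in this sketch. The dictionary keys initial_lr and sequence_length are names chosen for the example.

```python
from itertools import product

# Two candidate values per varied setting (values taken from the list above;
# the key names "initial_lr" and "sequence_length" are illustrative)
varied_settings = {
    "num_layers": [2, 4],
    "hidden_size": [5, 10],
    "bidirectional": [True, False],
    "initial_lr": [0.1, 0.25],
    "sequence_length": [100, 300],
}

# One dict per combination of setting values
configurations = [
    dict(zip(varied_settings, values))
    for values in product(*varied_settings.values())
]
```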
The settings were ranked by summing, for each setting value, the ranks of the models that used it, where a higher rank means a better model. For example, the best model out of the 32 models was bidirectional, so bidirectional = True got 32 points for that model.
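A minimal sketch of such a scoring scheme is shown below. The function name and the data layout are hypothetical, and the exact point scale is an assumption: here the worst run gets 1 point and the best run gets as many points as there are runs.

```python
from collections import defaultdict


def score_settings(runs):
    """Sum rank points per setting value.

    `runs` is a list of (settings_dict, validation_accuracy) pairs.
    Assumption: worst run = 1 point, best run = len(runs) points.
    """
    ordered = sorted(runs, key=lambda run: run[1])  # worst first
    points = defaultdict(int)
    for rank, (settings, _) in enumerate(ordered, start=1):
        for name, value in settings.items():
            points[(name, value)] += rank
    return dict(points)


# Three toy runs that differ in a single setting
example_runs = [
    ({"bidirectional": True}, 0.80),
    ({"bidirectional": False}, 0.20),
    ({"bidirectional": True}, 0.70),
]
scores = score_settings(example_runs)
```

In the toy example the two bidirectional runs take ranks 3 and 2, so bidirectional = True collects 5 points and bidirectional = False collects 1 point.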
Most of the models achieved validation accuracies of ~20%, which is just a bit more than chance. 4 models achieved > 70% and one got 80%.
Training summary
Validation
Interpretation
- more layers and larger hidden size not necessarily better (however: best model had hidden_size = 10 and num_layers = 4)
- initial learning rate should be reduced to about 0.04
- bidirectional LSTM seems superior in this case
- sequence lengths of 100-300 are optimal
Phase 2
Phase 2 used the lessons from Phase 1, i.e. always using a bidirectional LSTM and a lower learning rate.
With those, the best model from Phase 2 achieved an accuracy of 90.1%, using 4 layers, a hidden size of 8 and a sequence length of 200.
Normalizing the data
For testing purposes, some models were trained on data where the voltage readings were normalized between 0 and 1.
This makes it independent of voltage amplitude, which will vary when the sensor is pushed into the material with a different force.
This is very likely to occur even when using the same setup with the same material again.
However, all models using the normalized data achieved accuracies of ~15%, which is less than pure chance.
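For reference, min-max normalization to the range 0 to 1 can be sketched as follows. The function name is hypothetical, and whether the project normalized per recording or per piece is not stated, so the sketch simply normalizes whatever sequence it is given.

```python
def min_max_normalize(values):
    """Scale a sequence linearly so its minimum maps to 0 and its maximum to 1."""
    low, high = min(values), max(values)
    if high == low:
        return [0.0 for _ in values]  # constant signal: nothing to scale
    return [(v - low) / (high - low) for v in values]


# The same waveform at two different amplitudes
weak_press = min_max_normalize([2.0, 4.0, 6.0])
strong_press = min_max_normalize([20.0, 40.0, 60.0])
```

Both sequences map to the same normalized shape, which shows the intended amplitude independence. It also means that any information carried by the absolute amplitude is removed from the model input.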
Phase 3
Source code
The complete source code for the data collection script and for the machine learning part (model training + model evaluation) can be found on my Gitea.