Inference code

Creating Labeled UWAR Files

    uwar_combine takes several .uwar files, or a directory containing a single session, and creates one contiguous UWAR file. It can also take a text label file and insert the labels as TAG items in the combined UWAR stream. A label file alternates a timestamp (in seconds) on one line with the string label on the next, as in this excerpt (a sketch for generating such a file from a CSV export follows the example command below):
    0.000000
    null
    703.125000
    null
    884.375000
    walk2
    1000.000000
    walk3
    1074.218750
    walk3
    1250.000000
    walk4
    1601.562500
    run1
    1777.343750
    walk1
    

    Example Command: <bash>uwar_combine -scandir "001 MSB lab 1 - 100907/" -labelFile "001 MSB lab 1 - 100907/sub1-lab1-ms_resampled.txt" -out ./src_data/glen_sub001_lab1_combined.uwar</bash>
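
    The label file itself typically comes from whatever annotation tool was used to mark the session. As a minimal sketch (assuming a hypothetical two-column "timestamp,label" CSV export named sub1-lab1-annotations.csv, with timestamps already in seconds), the alternating timestamp/label layout above can be produced with awk: <bash>
    # Hypothetical conversion: "timestamp,label" CSV -> alternating timestamp/label lines.
    # Both file names here are placeholders for whatever the annotation tool exports.
    awk -F',' '{ printf "%f\n%s\n", $1, $2 }' sub1-lab1-annotations.csv > sub1-lab1-ms.txt
    </bash>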

Creating ARFF Training Feature Files

    You can run the inference engine without a classifier, specifying just the features (in the .xml configuration file) you would like to compute. When you pass the -label flag, the inference engine reads the TAG objects from the UWAR stream as ground-truth labels. Note that you must have a label in the UWAR stream at least every ~2137 seconds, otherwise the inference engine will crash (a quick check for oversized gaps is sketched below). Example command: <bash>inference -xml generateFeaturesForTraining.xml -uwarin ./src_data/glen_sub001_lab1_combined.uwar -arffout ./trainingFeatures/glen_sub001_lab1_features.arff -label -silent</bash> Example .xml training file: <xml></xml>
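
    The ~2137-second limit is easy to trip over when labels are sparse. A quick sanity check (just a sketch, run against the text label file rather than the UWAR stream itself) is to scan for oversized gaps between consecutive label timestamps before combining: <bash>
    # Warn about gaps larger than ~2137 s between consecutive label timestamps
    # (timestamps sit on the odd-numbered lines of the label file).
    awk 'NR % 2 == 1 {
           if (prev != "" && $1 - prev > 2137)
             printf "label gap of %.1f s ending at t=%s\n", $1 - prev, $1;
           prev = $1
         }' "001 MSB lab 1 - 100907/sub1-lab1-ms_resampled.txt"
    </bash>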

Training Process

    Generate Source .uwar files with Labels

      <bash>
      ./bin/uwar_combine -scandir "/projects/ubicomp3/glen/subject-tests/MSB-001/001 MSB lab 1 - 100907/" -labelFile "/projects/ubicomp3/glen/subject-tests/MSB-001/001 MSB lab 1 - 100907/sub1-lab1-ms.txt" -out ./src_data/glen_sub001_lab1_combined.uwar
      ./bin/uwar_combine -scandir "/projects/ubicomp3/glen/subject-tests/MSB-006/006 MSB lab 1/" -labelFile "/projects/ubicomp3/glen/subject-tests/MSB-006/006 MSB lab 1/sub6-lab1-ms.txt" -out ./src_data/glen_sub006_lab1_combined.uwar
      ./bin/uwar_combine -scandir "/projects/ubicomp3/glen/subject-tests/MSB-017/017 MSB lab 2 - 102207/" -labelFile "/projects/ubicomp3/glen/subject-tests/MSB-017/017 MSB lab 2 - 102207/sub17-lab2-ms.txt" -out ./src_data/glen_sub017_lab2_combined.uwar
      </bash>
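
      With more subjects the same invocation can be driven from a small loop; this is only a sketch, and the session entries below just mirror the commands above (add further "directory|label file|subject id" entries as needed): <bash>
      # Sketch: batch uwar_combine over several sessions described as "dir|label file|subject id".
      sessions=(
        "/projects/ubicomp3/glen/subject-tests/MSB-001/001 MSB lab 1 - 100907/|sub1-lab1-ms.txt|glen_sub001_lab1"
        "/projects/ubicomp3/glen/subject-tests/MSB-006/006 MSB lab 1/|sub6-lab1-ms.txt|glen_sub006_lab1"
      )
      for entry in "${sessions[@]}"; do
        IFS='|' read -r dir labels subj <<< "$entry"
        ./bin/uwar_combine -scandir "$dir" -labelFile "${dir}${labels}" -out "./src_data/${subj}_combined.uwar"
      done
      </bash>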

    Generate Features by running inference

      Run the inference engine without a classifier model to simply output ARFF files containing the computed features (a batch-processing sketch follows the commands below). <bash>
      # Generate training features:
      # inference -xml generateFeaturesForTraining.xml -uwarin <file> -arffout <output> -label -silent
      ./bin/inference -xml generateFeaturesForTraining.xml -uwarin ./src_data/glen_sub001_lab1_combined.uwar -arffout ./src_data/features/glen_sub001_lab1_features.arff -label -silent
      ./bin/inference -xml generateFeaturesForTraining.xml -uwarin ./src_data/glen_sub006_lab1_combined.uwar -arffout ./src_data/features/glen_sub006_lab1_features.arff -label -silent
      ./bin/inference -xml generateFeaturesForTraining.xml -uwarin ./src_data/glen_sub017_lab2_combined.uwar -arffout ./src_data/features/glen_sub017_lab2_features.arff -label -silent
      </bash>
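
      If the *_combined.uwar naming above is kept, feature generation can also be applied to every combined file in one pass; a minimal sketch: <bash>
      # Sketch: generate features for every combined UWAR file, deriving the .arff
      # name from the .uwar name (assumes the *_combined.uwar naming used above).
      for u in ./src_data/*_combined.uwar; do
        base=$(basename "$u" _combined.uwar)
        ./bin/inference -xml generateFeaturesForTraining.xml -uwarin "$u" \
            -arffout "./src_data/features/${base}_features.arff" -label -silent
      done
      </bash>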

    Create a training set

      Combine the .arff files into a single training .arff file (an illustration of what ARFF concatenation involves follows below): <bash>
      # Combine sub001 and sub006 to create our training set:
      echo "Combining .arff files to create Training set"
      rm ./src_data/train/sub001_sub006_features.arff
      ./training_bin/arffcat.pl ./src_data/features/glen_sub001_lab1_features.arff ./src_data/features/glen_sub006_lab1_features.arff > ./src_data/train/sub001_sub006_features.arff
      </bash>

    Create the Class/Not Class ARFF files

      To train the boosted decision stump classifiers we need Class and Not Class examples for each activity. allacts2binacts.pl performs this operation (using actset2binactset.pl) and generates one output file per class, named <user prefix>PositiveClassName.arff (an illustration of the one-vs-all relabeling follows below): <bash>
      cd ./training_bin
      perl allacts2binacts.pl ../src_data/train/sub001_sub006_features.arff ../src_data/train/trainSet_sub001_006__
      cd ..
      </bash>
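
      The Class/Not Class split is a standard one-vs-all relabeling: every instance keeps its features, but its class value is rewritten to either the positive class name or a single negative label. Purely as an illustration of that idea for one positive class (not the actual allacts2binacts.pl, and assuming the class is the last attribute, is named "class", and uses unquoted nominal values): <bash>
      # Illustration only: one-vs-all relabeling for positive class "walk1".
      # The output file name is hypothetical; the real script derives names from the prefix.
      awk -F',' -v OFS=',' -v pos=walk1 '
        /^@attribute[ \t]+class/ { print "@attribute class {" pos ",not_" pos "}"; next }
        /^@/ || /^%/ || NF == 0  { print; next }
        { $NF = ($NF == pos) ? pos : "not_" pos; print }
      ' ./src_data/train/sub001_sub006_features.arff > ./src_data/train/example_walk1_binary.arff
      </bash>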

    Train the Boosted Stumps Classifiers

      Train the classifier using our Class/Not Class ARFF files: <bash>
      # perl ./training_bin/boostedstumpall.pl <input file path> <input file base> <output file>
      cd ./training_bin
      perl boostedstumpall.pl ../src_data/train/ trainSet_sub001_006__ ../src_data/trainedClassifier/boostedClassifier
      cd ..
      </bash>
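
      boostedstumpall.pl presumably wraps Weka here; boosted decision stumps correspond to Weka's AdaBoostM1 meta-classifier with a DecisionStump base learner, so for reference a single Class/Not Class file can also be trained directly along these lines (a sketch only, assuming weka.jar is on the classpath; the round count and file names are illustrative, not the script's actual settings): <bash>
      # Sketch of the underlying Weka call for one Class/Not Class file.
      java -cp weka.jar weka.classifiers.meta.AdaBoostM1 \
          -t ./src_data/train/trainSet_sub001_006__walk1.arff \
          -I 100 \
          -W weka.classifiers.trees.DecisionStump \
          > ./src_data/trainedClassifier/stump_output_walk1.raw
      </bash>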

    Compile and Classify Test Traces

      Once we've generated a classifier we have to turn the Weka raw output into a usable form and then classify the data. In this case we use 'glen_sub017_lab2_features.arff' as our test data set that we want output for. <bash>
      cd ./training_bin
      rm ../src_data/trainedClassifier/classifier.csv
      rm ../src_data/trainedClassifier/margins.csv
      perl compile_classifier.pl ../src_data/trainedClassifier/stump_output.raw > ../src_data/trainedClassifier/classifier.csv
      perl run_classifier.pl ../src_data/features/glen_sub017_lab2_features.arff ../src_data/trainedClassifier/classifier.csv > ../src_data/trainedClassifier/margins.csv
      cd ../
      </bash>
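
      A quick sanity check (assuming run_classifier.pl writes one margins row per test instance) is to compare the number of rows in margins.csv with the number of data rows in the test .arff: <bash>
      # Count data rows after "@data" in the test .arff and compare with margins.csv.
      awk 'tolower($1) == "@data" { d = 1; next } d && NF > 0 { n++ } END { print n, "test instances" }' \
          ./src_data/features/glen_sub017_lab2_features.arff
      wc -l < ./src_data/trainedClassifier/margins.csv
      </bash>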

    Analyze Output from Classifier

      After classification we want to create a confusion matrix to see how well our classifier is doing. <bash>
      cd ./training_bin
      rm ../src_data/trainedClassifier/confusion.csv
      cat ../src_data/trainedClassifier/margins.csv | perl gen_confusion_matrix.pl geom 0.0 > ../src_data/trainedClassifier/confusion.csv
      cat ../src_data/trainedClassifier/confusion.csv
      cd ../
      </bash>
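
      To reduce the confusion matrix to a single number, overall accuracy is the sum of the diagonal divided by the total count. The sketch below assumes confusion.csv is comma-separated with a header row of predicted-class names and a leading true-class column; the actual layout written by gen_confusion_matrix.pl may differ: <bash>
      # Sketch: overall accuracy = diagonal sum / total, under the assumed CSV layout above.
      awk -F',' 'NR > 1 { for (j = 2; j <= NF; j++) { total += $j; if (j == NR) correct += $j } }
                 END { if (total > 0) printf "accuracy: %.3f (%d/%d)\n", correct/total, correct, total }' \
          ./src_data/trainedClassifier/confusion.csv
      </bash>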