Step detection

Running

inference can be run either with live or canned input.

Canned input: UWAR files

Canned input is useful for development and debugging, and for the fact that it can be run on x86 and without central or msb-server. <bash>./inference -xml peak_detection.xml -uwarin log_file.uwar</bash> peak_detection.xml represents the file describing the mill DAG. log_file.uwar represents the file with the canned input in UWAR format, collected with logger.

Live input

To do realtime step detection, inference needs to be run on the IMote, along with central and msb-server (and bt-listener if you'd like the output sent over bluetooth e.g. to a phone). <bash>./inference -xml peak_detection.xml -bt2 60</bash> Again, peak_detection.xml represents the file describing the mill DAG. But in this case the input comes directly from msb-server (over shared memory).

To run it automatically, see running processes with proccontrol.

Output

The output of inference is a comma-separated list of "features." If the output is sent over bluetooth (i.e. the -bt or -bt2 n option is set), then it contains descriptors for the features. The bt2 format has the following form:

((#DESC#descriptor(,descriptor)*|#DATA#value(,value)*)\r)*

Be sure to note that lines are terminated with a carriage return (ASCII 0xd).

So for example:

#DATA#12345,1,2,3
#DATA#12346,1,2,3
#DESC#TIME,steps_05s,stridelength_mean,stridelength_stddev
#DATA#12347,1,2,3
#DATA#12348,1,2,3

The number and order of the elements comes directly from the inference XML description. It won't change any more often than the XML file.

The time is milliseconds since the epoch, but bear in mind that the IMote's clock will be wrong unless you set it. It's more likely to be simply uptime.

Algorithm

The algorithm is simple in principle. It seeks to partition the input on (somewhat arbitrary) step boundaries and then find the maximum of the signal in each partition. At a high level, it does roughly this:

Take the magnitude of the 3d accelerometer vector.
Smooth the magnitude.
Take the derivative of the smoothed magnitude.
Look for peaks in the smoothed magnitude between ascending zero-crossings of the derivative.

Step 1 is straightforward. Rather than looking at any component of the 3D acceleration, the algorithm considers only the magnitude of the vector. This means no attempt is made to determine the orientation of the device and there is no dependence on any particular orientation.

Step 2 is somewhat more complicated and has been done in two different ways. The problem is that the magnitude data are not smooth at the granularity in which we're interested. There are high-frequency effects from effectively superimposing the three acceleration components. There are also other high-frequency, low-amplitude elements of the signal which are presumed not to be important to partitioning the data; they may be noise or simply some other uncorrelated motion. The goal then is to separate the periodic characteristics due to bipedal motion from those not.

The first approach taken to the smoothing problem came from a pair of assumptions: that the interesting signal was at about 1-4 hz, and that most uninteresting features of the signal were low-amplitude. The smoother first put the signal through a simple low-pass filter: the mean over a sliding window. Then step 3 was run to find local extrema. If the absolute value from one extremum to the next was small, THIS IS PAINFUL COME BACK TO IT

The second (and current) approach took the view that the signall was at some low (but unknown) frequency, and that the uninteresting parts of the signal were the higher-frequency components. It addressed this directly with a filter with a frequency-based cutoff. The cutoff, however, is dynamic.

The FFT of the signal over a window is taken. The total energy (less the DC value) is then determined. A theshold portion of that energy is picked as likely containing most of the interesting parts of the signal. The fewest number of the lowest frequency bins are then included that contain at least that much energy. The inverse FFT is then taken.

This works well over the window, but the dynamic choice of the energy cut-off means that a different filter is applied to each window. This leaves a problem of smoothly stitching together the inverse FFTs. The applied solution is to slide the window by half the window length, then combine the outputs with a weighted average which gives more weight toward the center of the window.

Step 3 is performed to locate local extrema in the smoothed signal. The minima are presumed to be mid-step.

Finally, peak is the maximum of the signal between two mid-step boundaries.

Step detection XML

The WorkDag contains the description of the dataflow. Its immediate sub-elements are "mills," logical steps in the computation. The DAG may specify many inputs (sensor mills) but has only one output mill, specified by the WorkDag attribute resultCollectionName. <xml> <?xml version="1.0" standalone="no"?>

<WorkDag resultCollectionName="features"

       maxCheckupPeriod="1000000" >

</xml> We instantiate the accelerometer sensor mill. All sensor mills use only the UPDATE_UNCONDITIONAL update policy. The sample period is given in microseconds. Note that this sample period is only a cross-check; it is ultimately determined by msb-server from /msp/msb.ini. 1953 μs corresponds to 512 Hz. <xml>

 <Sensor name="accel"
         updatePolicy="UPDATE_UNCONDITIONAL"
         sampPeriod="1953"
 />

</xml> The accelerometer sensor mill is a MillMo with three outputs per sample—the x- y- and z-components of a 3-dimensional accelerometer reading. The first step of the algorithm takes the 1-dimensional magnitude of each reading. UPDATE_BY_SRC_PIGGYBACK causes a mill to be updated whenever its source mill is. <xml>

 <ShortMag name="accelmag"
         srcName="accel"
         updatePolicy="UPDATE_BY_SRC_PIGGYBACK"
 />

</xml> An FFT is applied over a sliding window to operate on the signal in the frequency domain. Together, UPDATE_BY_NUM_NEW_SRC_SAMP and numSrcSampPerUpdate="1024" specify that accelmag_fft is updated once for every 1024 samples of input. (At every update, it produces one output record.) However, numSrcSampRqd="2048" indicates to accelmag_fft that the FFT is to be performed over the last 2048 samples. Thus its output windows overlap by half. This is required later for smoothing out the inverse FFT after a dynamic filter is applied.

This mill is sensitive to the sample rate. For example, to cut the sample rate in half from 512 Hz to 256 Hz and have this mill operate over the same window length (in time), numSrcSampPerUpdate and numSrcSampRqd would also need to be cut in half. <xml>

 <Fft16 name="accelmag_fft"
         srcName="accelmag"
         updatePolicy="UPDATE_BY_NUM_NEW_SRC_SAMP"
         numSrcSampPerUpdate="1024"
         numSrcSampRqd="2048"
 />

</xml> The energy threshold is a simple mill which zeros its output if the energy of its input is not above the given threshold. Since the energy of a discrete-time function is not time-normalized, this mill too is sensitive to the sample rate. If the sample rate were halved, so should threshold be.

The purpose of this mill is to avoid detecting steps when very little is detected on the accelerometer. However, to keep the mean of the signal steady, the DC is still passed through. This may be a mistake—it means variations at frequencies lower than the window size still get through, raising the need for still another mechanism to suppress detection when idle (peaks_max_sane). <xml>

 <EnergyThreshold name="accelmag_fft_thresholded"
         srcName="accelmag_fft"
         updatePolicy="UPDATE_BY_SRC_PIGGYBACK"
         threshold="250000"
 />

</xml> accelmag_fft_filtered operates on the signal in the frequency domain based on the relative energy content of the bins. I'm not really sure how threshold will scale with changes to the sample rate. Note that as with accelmag_fft_thresholded, accelmag_fft_filtered is updated once with every update of its source. <xml>

 <EnergyFilter name="accelmag_fft_filtered"
         srcName="accelmag_fft_thresholded"
         updatePolicy="UPDATE_BY_SRC_PIGGYBACK"
         threshold=".16"
 />

</xml> accelmag_smooth returns the signal to the time domain. The attribute overlap="1" indicates that it expects each window to be twice the size of the number of outputs; that is, it is relying on accelmag_fft's 1:2 ratio of numSrcSampPerUpdate to numSrcSampRqd. Fft16I does something no other mill in LIRA has done yet: output more than one sample per update. So while the mill itself is updated here once every 2 seconds, at each update it produces 1024 samples of output. This means that if a downstream mill wants to see every sample, it must request to be updated for every sample, rather than by piggyback. <xml>

 <Fft16I name="accelmag_smooth"
         srcName="accelmag_fft_filtered"
         updatePolicy="UPDATE_BY_SRC_PIGGYBACK"
         overlap="1"
 />

</xml> The integer FIR filter simply takes the derivative of the signal. It comes directly from Ying et al's "Automatic Step Detection in the Accelerometer Signal." <xml>

 <IntegerFirFilter name="deriv"
         srcName="accelmag_smooth"
         updatePolicy="UPDATE_BY_NUM_NEW_SRC_SAMP"
         numSrcSampPerUpdate="1" >
   2 1 0 -1 -2
 </IntegerFirFilter>

</xml> The zero crossing mill flags output records where the signal crosses zero in the direction specified by edge. <xml>

 <ZeroCross name="peaks_max"
         srcName="deriv"
         updatePolicy="UPDATE_BY_SRC_PIGGYBACK"
         edge="falling"
 />

</xml> peaks_max_sane is used to suppress very low-frequency variation, the final component in stopping idle detection. It has no effect when a signal is being detected at at least 1 Hz. (It is sensitive to the sample rate.) <xml>

 <TimeDiffLim name="peaks_max_sane"
         srcName="peaks_max"
         updatePolicy="UPDATE_BY_SRC_PIGGYBACK"
         minSamps="16"
         maxSamps="512"
 />

</xml> This configuration computes a couple simple features. steps_05s is the number of detected steps in the last 5 seconds. The window length is sensitive to the sample rate. stridelength computes the mean and standard deviation of the last windowLength=16 strides. It has some simple logic that handles large gaps which, again, is sensitive to the sample rate. <xml>

 <WinSum name="steps_05s"
         srcName="peaks_max_sane"
         updatePolicy="UPDATE_BY_NUM_NEW_SRC_SAMP"
         numSrcSampPerUpdate="128"
         windowLength="2560"
 />

 <TimeDiffShort name="stridelength"
         srcName="peaks_max_sane"
         updatePolicy="UPDATE_BY_SRC_PIGGYBACK"
         windowLength="16"
         deltaMax="512"
 />

 <SplitShort name="stridelength_mean"
         srcName="stridelength"
         updatePolicy="UPDATE_BY_SRC_PIGGYBACK"
         componentName="mean"
 />

 <SplitShort name="stridelength_stddev"
         srcName="stridelength"
         updatePolicy="UPDATE_BY_SRC_PIGGYBACK"
         componentName="stddev"
 />

 <Collection name="debug"
         updatePolicy="UPDATE_BY_PERIOD"
         sampPeriod="1953" >
   <component name="accelmag" />
   <component name="accelmag_smooth" />
   <component name="deriv" />
   <component name="peaks_max_sane" />
 </Collection>

 <Collection name="features"
         updatePolicy="UPDATE_BY_PERIOD"
         sampPeriod="250000" >
   <component name="steps_05s" />
   <component name="stridelength_mean" />
   <component name="stridelength_stddev" />
 </Collection>

</WorkDag> </xml>

Mills

During the course of the implementation of the algorithm, several new mills were created. Many were set aside in the final mill DAG, but the following is a description of them all anyway.

Please see the description of LIRA on Intel's MSP wiki for background on the architecture and mill types and terminology.

AccelIntervalsMill

MillSiMo
Computes the difference in the values of a signal (one component) between two events in another (another component).
Passes through its inputs with the addition of a field with the deltas.
Was used to compute the "size" of a peaks in a previous iteration of the algorithm.

EnergyFilterMill

MillSiMo
Takes a frequency-domain input and applies the "energy filter" which tries to preserve the fewest number of bins required to describe a certain percentage of the input energy.
Used to implement a low-pass filter of the magnitude of the accelerometer signal.

EnergyThresholdMill

MillSiMo
Either passes through its input vector or zeros all but the first bin, depending on the energy of the vector.
Used to suppress low-energy portions of the signal almost entirely.

Fft16DynMill

MillSiMo
The idea was to provide a mill that computed the FFT over a variable number of input samples which would be given by another input.
Unfinished, unused.
Might be worth finishing in the future for step-detection-derived features, in combination with InterpolationMill.

Fft16IMill

MillSiSo.
Compute the inverse FFT of a vector.
Used to return a signal to the time-domain.
Capable of combining frequency-domain windows overlapping in time (required because of the use of the dynamic EnergyFilterMill).

IntegerFirFilterMill

MillSiSo
Similar to LtiFilterMill, but with only the Bs—that is, only capable of implementing finite impulse response functions.
All integer math. (LtiFilterMill uses floats and the IMote2 doesn't have floating point hardware.)
Used to implement the derivative.

InterpolationMill

MillSiMo
The ideas was to take a variable-length input vector and interpolate to generate a fixed-length output.
Unfinished, unused.
May be worth finishing in the future to support step-detection-derived features, along with Fft16DynMill.

MillDelay<T>

MillSiSo (template, instantiated for short)
Delay the input by a specified number of samples.
The idea was to fix the synchronization when a mill had to delay a computation.
Unused because computations aren't delayed by more than a few samples (milliseconds).

MillShift<T>

MillSiSo (template, instantiated for unsigned int)
Applies a bit shift.
Unused.

MillTimeDiff<T>

MillSiMo (template, instantiated for short)
Output the mean and standard deviation of times (samples) between non-zero inputs.
Used to generated the "stride length mean" and "stride length standard deviation" features.

PivotMill

MillSiSo
Presents input to output as a single vector of variable length on events in a second input signal.
Completed but unused, untested.
Could be used together with InterpolationMill to time-normalize stuff.

RangeMatchMill

MillSiSo
Categorizes the input signal by domain membership.
Hard-coded to have 3 ranges.
Used to categorize sitting/walking/running in the balance demo.

ThresholdMill

MillSiSo
Pass through the input signal unless it passes the thresholds.
Like WinThresholdMill but simpler, faster, and only the "absolute" part.
Used to suppress small peaks in a previous iteration of the algorithm.

TimeDiffLimMill

MillSiSo
Zero an record if it occurs to soon or too late after the last non-zero record.
Used to suppress very high and very low frequency events.

WinMinMaxMill

MillSiMo
Finds the min and max of a signal (one component) between two events in another (another component).
Passes through its inputs with the addition of fields for the min and max.
1 is output in the min or max field whever the min or max occurs, otherwise 0 is output.
Was used to find peaks before zero crossings of the derivative were used.

WinSumMill

MillSiSo
Output the sum of the input over the last window.
Used to generate the feature "number of steps in last 5 seconds."

WinThresholdMill

MillSiMo
Thresholds one of its input components.
Two sorts of thresholds are available, absolute and relative.
- Absolute thresholds apply to every sample.
- Relative thresholds are relative to extrema within a window.
Was used to throw away relatively small peaks in a previous iteration of the algorithm.

ZeroCrossingMill

MillSiMo
Detects when the specified input component passes zero.
Passes through its inputs as well as a field indicating whether a zero crossing occurred at a given time.
Superceded by ZeroCrossMill.

ZeroCrossMill

MillSiSo
Outputs whether its input crossed zero.
The zero crossings of the derivative of a signal occur at the signal's extrema.

Source

CVS checkout / export

To check out the latest code: <bash>cvs -d user@bicycle.cs.washington.edu:/projects/ubicomp/uwar/CVS checkout msp_peak_detection</bash>

CVS tags

We're trying to keep our branch of msp reasonably in-sync with Intel Research Seattle's. That means we periodically merge in updates from the sourceforge repository maintained by Intel.

msp_peak_detection
msp_2008_02_05
pre-merge-1
merge-1
merge-2
pre-merge-3
merge-3
step-detection-paper
pre-merge-4
merge-4

Building

msp_peak_detection is built the same way as other msp branches. <bash>cd msp_peak_detection/src ./configure # unless you're cross-compiling make cd inference make</bash> If you are cross-compiling, you'll first need to set the environment variable CROSS_COMPILE to the path to your cross-compiler and cross-binutils.

Binaries

A binary build is available: msp-step-demo.tar.gz.

Copy the contents of the tarball onto an SD card. Stick the SD card in the MSB and boot up the device. inference will be launched automatically.

Building a new image

See the articles on creating an MSP SD image and running processes with proccontrol.

Old Matlab code

Paper

Ryan Libby wrote his undergraduate honors thesis on this algorithm, its implementation, and its performance. Link to PDF.