Facemorph
Class GMM

java.lang.Object
  extended by Facemorph.GMM

public class GMM
extends java.lang.Object

A Gaussian mixture model


Constructor Summary
GMM(int count)
          Constructs a GMM of count clusters
 
Method Summary
static double[] append(double[] s_frame, double[] a_frame)
          Concatenates two arrays
static java.util.Vector appendVectors(java.util.Vector v1, java.util.Vector v2)
          Concatenates each array of v2 onto the corresponding array of v1
 void copy(GMM gmm)
          Copy the given GMM
static java.util.Vector createDynamics(java.util.Vector samples)
          Creates estimates for the dynamic coefficients by concatenating each frame with the next
static java.util.Vector createDynamics2(java.util.Vector samples)
          Estimates the derivative of the appearance
 void display()
          Displays the GMM on the console
 java.util.Vector EM(java.util.Vector samples)
          Perform the EM algorithm to build a GMM on a number of samples
 void EMcluster(int iter, java.util.Vector samples)
          An experimental EM clustering algorithm
 java.util.Vector EMcluster(java.util.Vector samples)
          Attempt at automatically deciding on the number of clusters (doesn't really work)
 Gaussian getCluster(int i)
          Get the ith cluster
 int getCount()
          Get the number of clusters
 double[] getRandomSample(java.util.Random rand)
          Samples from one Gaussian, chosen according to the cluster weights
 double getWeight(int i)
          Gets the weight of the ith cluster
 void initialiseAtSamples(java.util.Vector samples)
          Initialise this GMM to have one Gaussian centered on each sample with equal probability
static void main(java.lang.String[] args)
          Tests of GMM class
 double maxProbabilty()
          Returns the maximum probability of this distribution
 void merge(java.util.Vector sampleWeights)
          Attempt at merging clusters
 void normalise()
          Forces the sum of weights to equal 1.0
static void normalise(double[] vals)
          Forces the sum of the given values to equal 1.0
static void normaliseSampleWeights(java.util.Vector sampleWeights)
          Normalise the sample weights to have sum of 1
static void normaliseSampleWeights2(java.util.Vector sampleWeights)
          Normalise the sample weights so that the most likely Gaussian has weight 1 and all the rest have weight zero
 double prob_sample_given_GMM(java.util.Vector sample)
          Calculates the probability of a set of samples
 double probability(double[] sample)
          Calculates the probability of the sample given
 void prune(double threshold)
          Remove clusters with a weighting less than threshold
 void random(java.util.Random rand, Gaussian gauss)
          Initialises a bunch of random centres
static int[] rank(double[] vec)
          Finds the ordering of a set of values
static void rank(double[] vec, int[] order, int p, int l)
          Find the ordering of a vector
 void read(java.io.DataInputStream in)
          Read from a DataInputStream
 void read(java.io.StreamTokenizer st)
          Read from a StreamTokenizer
 void read(java.lang.String filename)
          Read this GMM from a file
static java.util.Vector readMFCC(java.io.DataInputStream in)
          Read Mel-Frequency Cepstral Coefficients (MFCC) for audio analysis
static java.util.Vector readMFCC(java.io.StreamTokenizer st)
          Reads an MFCC file with most of the header stripped out
static java.util.Vector readMFCC(java.lang.String filename)
          Read Mel-Frequency Cepstral Coefficients (MFCC) for audio analysis
static java.util.Vector readVectors(java.io.DataInputStream in)
          Reads data vectors from an input stream
static java.util.Vector readVectors(java.io.StreamTokenizer st)
          Reads vectors using a StreamTokenizer
static java.util.Vector readVectors(java.lang.String filename)
          Reads a list of data vectors from a file
 void reestimate(java.util.Vector samples, java.util.Vector sampleWeights)
          Rebuild the GMM using the cluster weight for each sample
static java.util.Vector resampleVectors(java.util.Vector v, int N)
          Change the length of a 1D array using bi-linear interpolation
 void set(int i, Gaussian g, double w)
          Sets the covariance, mean and weight of cluster i
 GMM slice(double[] sample)
          Makes a new GMM as sum of slices through Gaussians
 void weightSamples(java.util.Vector samples, java.util.Vector sampleWeights)
          Calculate the probability of each sample belonging to each cluster
 void write(java.io.PrintStream out)
          Write this GMM to a printstream
 void write(java.lang.String filename)
          Write this GMM to a file named filename
static void writeVectors(java.util.Vector samples, java.io.PrintStream out)
          Write an array of vectors to a PrintStream
static void writeVectors(java.util.Vector samples, java.lang.String filename)
          Writes vectors to the named file
static void zeroSampleWeights(java.util.Vector sampleWeights)
          Set all the values in sampleWeights to zero
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

GMM

public GMM(int count)
Constructs a GMM of count clusters

Parameters:
count - the number of clusters to use
Method Detail

set

public void set(int i,
                Gaussian g,
                double w)
Sets the covariance, mean and weight of cluster i

Parameters:
i - index of cluster to set
g - The Gaussian of the ith cluster
w - the weight of the ith cluster

getCluster

public Gaussian getCluster(int i)
Get the ith cluster

Parameters:
i - the index of the cluster to get
Returns:
the ith cluster, or null if i > number of clusters

getWeight

public double getWeight(int i)
Gets the weight of the ith cluster

Parameters:
i - The index of the cluster to get
Returns:
The weight of the ith cluster, or 0 if i > weights.length

getCount

public int getCount()
Get the number of clusters

Returns:
returns the number of clusters in this GMM

normalise

public void normalise()
Forces the sum of weights to equal 1.0


normalise

public static void normalise(double[] vals)
Forces the sum of the given values to equal 1.0

Parameters:
vals - the values to normalise
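
The normalisation described above can be sketched as follows. This is an illustrative stand-alone version, not the Facemorph source; the class name NormaliseSketch and the handling of a zero sum are assumptions.

```java
import java.util.Arrays;

class NormaliseSketch {
    // Scales vals in place so that its entries sum to 1.0.
    // Assumption: an all-zero array is left untouched, since the
    // original behaviour for that case is not documented.
    static void normalise(double[] vals) {
        double sum = 0.0;
        for (double v : vals) {
            sum += v;
        }
        if (sum == 0.0) {
            return;
        }
        for (int i = 0; i < vals.length; i++) {
            vals[i] /= sum;
        }
    }

    public static void main(String[] args) {
        double[] w = {2.0, 1.0, 1.0};
        normalise(w);
        System.out.println(Arrays.toString(w)); // prints [0.5, 0.25, 0.25]
    }
}
```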

write

public void write(java.lang.String filename)
Write this GMM to a file named filename

Parameters:
filename - the name of the file to write to

write

public void write(java.io.PrintStream out)
Write this GMM to a printstream

Parameters:
out - The PrintStream to write to

read

public void read(java.lang.String filename)
Read this GMM from a file

Parameters:
filename - the name of the file to read from

read

public void read(java.io.DataInputStream in)
Read from a DataInputStream

Parameters:
in - The DataInputStream to read from

read

public void read(java.io.StreamTokenizer st)
Read from a StreamTokenizer

Parameters:
st - The StreamTokenizer to read from

slice

public GMM slice(double[] sample)
Makes a new GMM as sum of slices through Gaussians

Parameters:
sample - the values of the first n axes of this GMM
Returns:
A GMM that is a slice through this GMM in the hyper plane specified by sample

random

public void random(java.util.Random rand,
                   Gaussian gauss)
Initialises a bunch of random centres

Parameters:
rand - a Random number generator
gauss - a Gaussian with the distribution of the sample

EM

public java.util.Vector EM(java.util.Vector samples)
Perform the EM algorithm to build a GMM on a number of samples

Parameters:
samples - the vector of samples to fit a GMM to
Returns:
the sample weights in the final GMM

zeroSampleWeights

public static void zeroSampleWeights(java.util.Vector sampleWeights)
Set all the values in sampleWeights to zero

Parameters:
sampleWeights - the sampleWeights to zero

normaliseSampleWeights

public static void normaliseSampleWeights(java.util.Vector sampleWeights)
Normalise the sample weights to have sum of 1

Parameters:
sampleWeights - the set of sample weights to normalise

normaliseSampleWeights2

public static void normaliseSampleWeights2(java.util.Vector sampleWeights)
Normalise the sample weights so that the most likely Gaussian has weight 1 and all the rest have weight zero

Parameters:
sampleWeights - The sample weights to normalise
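
A minimal sketch of this winner-takes-all normalisation, assuming each element of sampleWeights is a double[] holding one weight per cluster (the element type is not stated on this page). The class and method names are hypothetical.

```java
import java.util.Vector;

class HardAssignSketch {
    // For each sample, sets the largest cluster weight to 1 and the rest to 0.
    // Ties go to the lowest cluster index (an assumption).
    static void hardAssign(Vector<double[]> sampleWeights) {
        for (double[] w : sampleWeights) {
            int best = 0;
            for (int k = 1; k < w.length; k++) {
                if (w[k] > w[best]) {
                    best = k;
                }
            }
            for (int k = 0; k < w.length; k++) {
                w[k] = (k == best) ? 1.0 : 0.0;
            }
        }
    }

    public static void main(String[] args) {
        Vector<double[]> weights = new Vector<>();
        weights.add(new double[] {0.2, 0.7, 0.1});
        hardAssign(weights);
        System.out.println(java.util.Arrays.toString(weights.get(0))); // prints [0.0, 1.0, 0.0]
    }
}
```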

weightSamples

public void weightSamples(java.util.Vector samples,
                          java.util.Vector sampleWeights)
Calculate the probability of each sample belonging to each cluster

Parameters:
samples - the samples to analyse
sampleWeights - the probability of each cluster for each sample
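
This corresponds to the E-step of EM. The sketch below shows the idea for a one-dimensional mixture, using plain arrays instead of the Gaussian class. All names are hypothetical, and whether the original normalises here or leaves that to normaliseSampleWeights is not stated; this version normalises in place.

```java
class EStepSketch {
    // Density of a 1D Gaussian with the given mean and standard deviation.
    static double gaussian(double x, double mean, double sd) {
        double z = (x - mean) / sd;
        return Math.exp(-0.5 * z * z) / (sd * Math.sqrt(2.0 * Math.PI));
    }

    // Returns r[n][k], the posterior probability that sample n belongs to cluster k.
    static double[][] responsibilities(double[] samples, double[] weights,
                                       double[] means, double[] sds) {
        double[][] r = new double[samples.length][weights.length];
        for (int n = 0; n < samples.length; n++) {
            double total = 0.0;
            for (int k = 0; k < weights.length; k++) {
                r[n][k] = weights[k] * gaussian(samples[n], means[k], sds[k]);
                total += r[n][k];
            }
            if (total > 0.0) {
                for (int k = 0; k < weights.length; k++) {
                    r[n][k] /= total; // each row now sums to 1
                }
            }
        }
        return r;
    }

    public static void main(String[] args) {
        double[][] r = responsibilities(new double[] {0.0},
                new double[] {0.5, 0.5}, new double[] {-1.0, 1.0}, new double[] {1.0, 1.0});
        System.out.println(r[0][0] + " " + r[0][1]); // prints 0.5 0.5
    }
}
```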

reestimate

public void reestimate(java.util.Vector samples,
                       java.util.Vector sampleWeights)
Rebuild the GMM using the cluster weight for each sample

Parameters:
samples - The samples to put into the GMM
sampleWeights - The probability of each sample with respect to each Gaussian
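
This corresponds to the M-step of EM. A one-dimensional sketch under the same assumptions as a standard EM formulation (weighted mean, weighted variance, weight proportional to total responsibility) follows; the names and the array layout are hypothetical, and empty clusters are simply left at zero.

```java
class MStepSketch {
    // Returns {weights, means, variances} of a 1D mixture given r[n][k],
    // the responsibility of cluster k for sample n.
    static double[][] reestimate(double[] samples, double[][] r, int clusters) {
        double[] weights = new double[clusters];
        double[] means = new double[clusters];
        double[] variances = new double[clusters];
        for (int k = 0; k < clusters; k++) {
            double nk = 0.0;
            double sum = 0.0;
            for (int n = 0; n < samples.length; n++) {
                nk += r[n][k];
                sum += r[n][k] * samples[n];
            }
            if (nk == 0.0) {
                continue; // empty cluster: leave its parameters at zero (assumption)
            }
            means[k] = sum / nk;
            double sq = 0.0;
            for (int n = 0; n < samples.length; n++) {
                double d = samples[n] - means[k];
                sq += r[n][k] * d * d;
            }
            variances[k] = sq / nk;
            weights[k] = nk / samples.length;
        }
        return new double[][] {weights, means, variances};
    }

    public static void main(String[] args) {
        double[] x = {0.0, 2.0, 10.0, 12.0};
        double[][] r = {{1, 0}, {1, 0}, {0, 1}, {0, 1}};
        double[][] p = reestimate(x, r, 2);
        System.out.println(p[1][0] + " " + p[1][1]); // prints 1.0 11.0
    }
}
```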

initialiseAtSamples

public void initialiseAtSamples(java.util.Vector samples)
Initialise this GMM to have one Gaussian centered on each sample with equal probability

Parameters:
samples - the samples to use

EMcluster

public java.util.Vector EMcluster(java.util.Vector samples)
Attempt at automatically deciding on the number of clusters (doesn't really work)

Parameters:
samples - the samples to build this GMM for
Returns:
return the weights of each sample in this model

merge

public void merge(java.util.Vector sampleWeights)
Attempt at merging clusters

Parameters:
sampleWeights - the weight of each sample

rank

public static int[] rank(double[] vec)
Finds the ordering of a set of values

Parameters:
vec - the vector of values whose ordering we wish to find
Returns:
the order from small to large
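
One plausible reading of "the order from small to large" is an index sort (argsort): the returned array lists the indices of vec in ascending order of value. The sketch below assumes that reading; it is not taken from the Facemorph source.

```java
import java.util.Arrays;

class RankSketch {
    // Returns indices such that vec[order[0]] <= vec[order[1]] <= ...
    static int[] rank(double[] vec) {
        Integer[] idx = new Integer[vec.length];
        for (int i = 0; i < idx.length; i++) {
            idx[i] = i;
        }
        // Sort the indices by the values they point at.
        Arrays.sort(idx, (a, b) -> Double.compare(vec[a], vec[b]));
        int[] order = new int[idx.length];
        for (int i = 0; i < order.length; i++) {
            order[i] = idx[i];
        }
        return order;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(rank(new double[] {3.0, 1.0, 2.0}))); // prints [1, 2, 0]
    }
}
```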

rank

public static void rank(double[] vec,
                        int[] order,
                        int p,
                        int l)
Find the ordering of a vector

Parameters:
vec - the vector to find the order of
order - the order of the vector's values
p - the start index of the sub array to sort
l - the end index of the sub array to sort

prune

public void prune(double threshold)
Remove clusters with a weighting less than threshold

Parameters:
threshold - the minimum weight for a cluster
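
As an illustration, pruning can be viewed as selecting the clusters whose weight reaches the threshold. The helper below is hypothetical and returns the surviving indices; whether the original renormalises the remaining weights afterwards is not documented, so this sketch does not.

```java
import java.util.ArrayList;
import java.util.List;

class PruneSketch {
    // Returns the indices of clusters whose weight is at least threshold.
    static List<Integer> keep(double[] weights, double threshold) {
        List<Integer> kept = new ArrayList<>();
        for (int i = 0; i < weights.length; i++) {
            if (weights[i] >= threshold) {
                kept.add(i);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        System.out.println(keep(new double[] {0.6, 0.01, 0.39}, 0.05)); // prints [0, 2]
    }
}
```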

getRandomSample

public double[] getRandomSample(java.util.Random rand)
Samples from one Gaussian, chosen according to the cluster weights

Parameters:
rand - A Random number generator
Returns:
Return a sample from this distribution
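
Sampling from a mixture is usually done in two stages: pick a cluster with probability equal to its weight, then draw from that cluster's Gaussian. The sketch below shows this for a one-dimensional mixture held in plain arrays; the names are hypothetical and the weights are assumed to be non-negative and to sum to 1.

```java
import java.util.Random;

class MixtureSampleSketch {
    // Picks a cluster index with probability equal to its weight.
    static int pickCluster(double[] weights, Random rand) {
        double u = rand.nextDouble(); // uniform in [0, 1)
        double cumulative = 0.0;
        for (int i = 0; i < weights.length; i++) {
            cumulative += weights[i];
            if (u < cumulative) {
                return i;
            }
        }
        return weights.length - 1; // guards against rounding error in the sum
    }

    // Draws one value from a 1D mixture of Gaussians.
    static double sample(double[] weights, double[] means, double[] sds, Random rand) {
        int k = pickCluster(weights, rand);
        return means[k] + sds[k] * rand.nextGaussian();
    }

    public static void main(String[] args) {
        Random rand = new Random(42);
        double x = sample(new double[] {0.3, 0.7}, new double[] {-5.0, 5.0},
                new double[] {1.0, 1.0}, rand);
        System.out.println(x);
    }
}
```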

probability

public double probability(double[] sample)
Calculates the probability of the sample given

Parameters:
sample - The sample to calculate the probability for
Returns:
returns an estimate of its probability
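
For a mixture model the probability (density) of a sample is the weighted sum of the cluster densities. A one-dimensional sketch with hypothetical names, using standard deviations rather than the full covariance handled by the Gaussian class:

```java
class MixtureDensitySketch {
    // Density of a 1D Gaussian with the given mean and standard deviation.
    static double gaussian(double x, double mean, double sd) {
        double z = (x - mean) / sd;
        return Math.exp(-0.5 * z * z) / (sd * Math.sqrt(2.0 * Math.PI));
    }

    // Mixture density: sum over clusters of weight * cluster density.
    static double density(double x, double[] weights, double[] means, double[] sds) {
        double p = 0.0;
        for (int k = 0; k < weights.length; k++) {
            p += weights[k] * gaussian(x, means[k], sds[k]);
        }
        return p;
    }

    public static void main(String[] args) {
        double p = density(0.0, new double[] {0.5, 0.5},
                new double[] {-1.0, 1.0}, new double[] {1.0, 1.0});
        System.out.println(p);
    }
}
```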

display

public void display()
Displays the GMM on the console


readVectors

public static java.util.Vector readVectors(java.lang.String filename)
Reads a list of data vectors from a file

Parameters:
filename - the name of the file to read from
Returns:
returns the Vector of data vectors or null if not able to

readVectors

public static java.util.Vector readVectors(java.io.DataInputStream in)
Reads data vectors from an input stream

Parameters:
in - the input stream to read from
Returns:
returns the vectors read

readVectors

public static java.util.Vector readVectors(java.io.StreamTokenizer st)
Reads vectors using a StreamTokenizer

Parameters:
st - The StreamTokenizer to use to parse tokens
Returns:
Returns the vectors read

writeVectors

public static void writeVectors(java.util.Vector samples,
                                java.lang.String filename)
Writes vectors to the named file

Parameters:
samples - the vectors to write
filename - the name of the file to write them to

writeVectors

public static void writeVectors(java.util.Vector samples,
                                java.io.PrintStream out)
Write an array of vectors to a PrintStream

Parameters:
samples - the samples to write
out - the printstream to write to

readMFCC

public static java.util.Vector readMFCC(java.lang.String filename)
Read Mel-Frequency Cepstral Coefficients (MFCC) for audio analysis

Parameters:
filename - the name of the file to read from
Returns:
return the array of sample vectors

readMFCC

public static java.util.Vector readMFCC(java.io.DataInputStream in)
Read Mel-Frequency Cepstral Coefficients (MFCC) for audio analysis

Parameters:
in - the DataInputStream to read from
Returns:
return the array of sample vectors

readMFCC

public static java.util.Vector readMFCC(java.io.StreamTokenizer st)
Reads an MFCC file with most of the header stripped out

Parameters:
st - the StreamTokenizer to read from
Returns:
Returns the data samples read

append

public static double[] append(double[] s_frame,
                              double[] a_frame)
Concatenates two arrays

Parameters:
s_frame - The first array
a_frame - the second array
Returns:
the concatenation of the first and second arrays
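
A concatenation of this kind is commonly written with System.arraycopy. The following stand-alone sketch (class name hypothetical) shows the expected result shape.

```java
import java.util.Arrays;

class AppendSketch {
    // Returns a new array holding first followed by second.
    static double[] append(double[] first, double[] second) {
        double[] out = new double[first.length + second.length];
        System.arraycopy(first, 0, out, 0, first.length);
        System.arraycopy(second, 0, out, first.length, second.length);
        return out;
    }

    public static void main(String[] args) {
        double[] joined = append(new double[] {1.0, 2.0}, new double[] {3.0});
        System.out.println(Arrays.toString(joined)); // prints [1.0, 2.0, 3.0]
    }
}
```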

appendVectors

public static java.util.Vector appendVectors(java.util.Vector v1,
                                             java.util.Vector v2)
Concatenates each array of v2 onto the corresponding array of v1

Parameters:
v1 - The first vector of arrays
v2 - the second vector of arrays
Returns:
the array of concatenated vectors

resampleVectors

public static java.util.Vector resampleVectors(java.util.Vector v,
                                               int N)
Change the length of a 1D array using bi-linear interpolation

Parameters:
v - the vector to resample
N - the desired new length of the vector
Returns:
the resampled vector
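
To illustrate the interpolation step, the sketch below resamples a plain double[] to a new length by linear interpolation between neighbouring entries, keeping the first and last values fixed. This is an assumption about the scheme; the original operates on a java.util.Vector and its exact endpoint handling is not documented.

```java
import java.util.Arrays;

class ResampleSketch {
    // Resamples in to length n using linear interpolation.
    static double[] resample(double[] in, int n) {
        if (in.length == 0 || n <= 0) {
            throw new IllegalArgumentException("need a non-empty input and n > 0");
        }
        double[] out = new double[n];
        if (n == 1) {
            out[0] = in[0];
            return out;
        }
        double scale = (double) (in.length - 1) / (n - 1);
        for (int i = 0; i < n; i++) {
            double pos = i * scale;                        // position in the input
            int lo = Math.min((int) pos, in.length - 1);   // left neighbour
            int hi = Math.min(lo + 1, in.length - 1);      // right neighbour
            double t = pos - lo;                           // blend factor in [0, 1)
            out[i] = (1.0 - t) * in[lo] + t * in[hi];
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(resample(new double[] {0.0, 10.0}, 3))); // prints [0.0, 5.0, 10.0]
    }
}
```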

createDynamics

public static java.util.Vector createDynamics(java.util.Vector samples)
Creates estimates for the dynamic coefficients by concatenating each frame with the next

Parameters:
samples - the speech sample vectors
Returns:
the dynamic vector
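
A sketch of the frame pairing, assuming samples holds one double[] per frame and that the last frame, which has no successor, produces no output (both are assumptions). The class and method names are hypothetical.

```java
import java.util.Vector;

class DynamicsSketch {
    // Joins each frame with the frame that follows it.
    static Vector<double[]> pairWithNext(Vector<double[]> frames) {
        Vector<double[]> out = new Vector<>();
        for (int t = 0; t + 1 < frames.size(); t++) {
            double[] now = frames.get(t);
            double[] next = frames.get(t + 1);
            double[] joined = new double[now.length + next.length];
            System.arraycopy(now, 0, joined, 0, now.length);
            System.arraycopy(next, 0, joined, now.length, next.length);
            out.add(joined);
        }
        return out;
    }

    public static void main(String[] args) {
        Vector<double[]> frames = new Vector<>();
        frames.add(new double[] {1.0});
        frames.add(new double[] {2.0});
        frames.add(new double[] {3.0});
        System.out.println(pairWithNext(frames).size()); // prints 2
    }
}
```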

createDynamics2

public static java.util.Vector createDynamics2(java.util.Vector samples)
Estimates the derivative of the appearance

Parameters:
samples - the samples to find the time derivative of
Returns:
return the time derivative
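
One simple way to estimate a time derivative from discrete frames is a forward difference between consecutive frames. The sketch below assumes that scheme and a unit time step; the estimator actually used by createDynamics2 is not documented here, and the names are hypothetical.

```java
import java.util.Vector;

class DeltaSketch {
    // Returns next - current for each pair of consecutive frames.
    static Vector<double[]> forwardDifference(Vector<double[]> frames) {
        Vector<double[]> out = new Vector<>();
        for (int t = 0; t + 1 < frames.size(); t++) {
            double[] now = frames.get(t);
            double[] next = frames.get(t + 1);
            double[] delta = new double[now.length];
            for (int d = 0; d < now.length; d++) {
                delta[d] = next[d] - now[d];
            }
            out.add(delta);
        }
        return out;
    }

    public static void main(String[] args) {
        Vector<double[]> frames = new Vector<>();
        frames.add(new double[] {1.0, 2.0});
        frames.add(new double[] {4.0, 6.0});
        double[] d = forwardDifference(frames).get(0);
        System.out.println(d[0] + " " + d[1]); // prints 3.0 4.0
    }
}
```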

maxProbabilty

public double maxProbabilty()
Returns:
the maximum probability of this distribution

prob_sample_given_GMM

public double prob_sample_given_GMM(java.util.Vector sample)
Calculates the probability of a set of samples

Parameters:
sample - the samples in a Vector of double[]
Returns:
returns the sum of probabilities for each sample

copy

public void copy(GMM gmm)
Copy the given GMM

Parameters:
gmm - the gmm to copy

EMcluster

public void EMcluster(int iter,
                      java.util.Vector samples)
An experimental EM clustering algorithm

Parameters:
iter - The number of iterations to use
samples - The data samples to build a GMM for

main

public static void main(java.lang.String[] args)
Tests of GMM class

Parameters:
args - The command line arguments for main