Vishal’s Summer Research

Some assembly required. But hopefully not MIPS.

A News Article About Me

Hi all. Calit2 has published a news article about me. The link is below:

http://calit2.net/newsroom/article.php?id=1372

I will have more later, so stay tuned!

August 28, 2008 Posted by Vishal Kotcherlakota | Uncategorized | | No Comments Yet

Adventures in the Smart Spaces Lab.

Introduction

Hello all. It’s been a while since I’ve done any documentation, so I will go ahead and start by talking about the actual setup in the Smart Spaces Lab here at Calit2. I’ll post the results I’ve been getting in another post.

The Technology

First, I’ll go into detail about the sound equipment I am using.

I’m using production-quality digital microphones for my project. Here’s a photo of one:


I’m using 16 of them to make 4 clusters of 4 microphones. Here’s a shot of them.


The microphones will connect to a sound card, shown below. (It’s the black box.)


The three steel boxes below it are called pre-amplifiers. Their job is to clean the signal from the microphones and send it to the sound card. The sound card then sends the signal to a computer, shown here:


For those of you who are engineering-inclined, here’s a schematic showing all the connections.


Building It

It’s time to roll up our sleeves and have at it. First order of business is the extension cords. The microphone cable is about 5 feet long, but we’ll need a lot more wire than that for our experiment. But to avoid a tangled and potentially damaging mess, some preparation is required. That means turning this…


…into this.


You might notice that I have color-coded and labeled every connector. With 16 connections, this is essential for organization. We will be mounting these clusters using a speaker stand:


You can see that the stand has a strange-looking piece of aluminum on it. This piece is part of something called the Industrial Erector Set.


The industrial erector set is just that–a set for building things.


We’ll be building clusters with this set.


Here’s a picture of the final product, waiting to be connected to our sound card.


I have put foam caps over the microphones, to make sure they fit snugly and that they are insulated from any vibrations in the aluminum.

However, I found that the spacing in these clusters was too small, and I couldn’t measure the time delay, due to a resolution effect, which I will explain in another post. I rebuilt the cluster to have a much wider spacing.


Currently, I’m running tests on the effectiveness of this new cluster at forming vectors towards a source. My source of sound is a PC speaker connected to my laptop.


From the desktop computer, I remotely play a music file on the laptop, and record the sounds the microphones pick up using a program called Adobe Audition. Here’s what a recording looks like:


And that’s what I’m up to for now. See you later!

August 9, 2008 Posted by Vishal Kotcherlakota | Uncategorized | | No Comments Yet

My Experiment Setup: The Non-Technical Version

Introduction

This is the layout for my experiment. As you might remember, I am working on an acoustic localization project. By acoustic localization I mean the tracking of an object by the sound it makes. I’ve adapted the post from my own blog to make it friendly to even non-engineers. ;)

The Goal

We aim to test a localization network using a pseudo-testbed. When I say testbed, I mean a system that tests an idea (in this case, my theory of localization). When I say pseudo-testbed, I mean that the testbed can’t actually localize the sound; it just collects data for me to work with. The idea is to see how noise in the room will affect my localization theory. The experiment consists of 2 parts: Record, and Process.

Part 1: Record


While this flow chart might look a little scary, it’s easy to understand. The orange dots represent microphones. I want a setup with 16 microphones arranged in 4 clusters, with 4 microphones per cluster. Each microphone will have its own channel of audio on a sound card. Most people know 2-channel audio as stereo:

I’ll need 16 channels of audio to record sound from my 16 microphones. The reason I can’t do it with fewer channels is the same reason you can’t fit 7 sodas into a six-pack: they simply won’t fit. Each microphone needs its own channel in the same way each bottle in the six-pack needs its own slot.

We connect our 16 microphones to 16 channels on a sound card, like the one below:


I will set up the sound card to output a computer file for every cluster. The file will have 4 channels, which will correspond to the individual members of the cluster.

Part 2: Process


Though this flowchart might look even scarier than its sibling, it’s very simple to explain. I open a program in MATLAB, which will read all 4 of my files and perform all the calculations required to get the location of the source. You might say that this process is similar to a high school physics lab, where you copy down all the data in the lab, and then figure out if your experiment worked (or didn’t work) the night before the lab is due . . . uh, I mean, after you collect the data. Hopefully I won’t get anything mixed up.

Technical Layout

I will be recording in the Smart Spaces Lab on the 5th floor of Calit2. The space available for testing is 90″ wide, 173″ deep, and 120″ in height. It’s a “glass house”, meaning that there are no actual walls, just tape marks on the floor. This will help us, as we will not have to worry as much about echoes created when the sound bounces off real walls.

I will place microphones at the midpoints of the rectangle bounding the room for my first configuration. The overhead view should look like the following:


Each of the blocks represents a speaker stand outfitted to carry 4 microphones in a cross pattern.


I’ll need to record about 5 sets of data to get a good idea of how my method is working. I’m also interested in trying out other microphone setups, to see how they handle the noise.

Conclusion

This method isn’t as exciting as a real-time test bed, but analytically it’s a better setup, because it allows me to reprocess the same data in the same conditions multiple times, ensuring consistency. Please, if you have any questions, comment, or email me at vishal@ucsd.edu. See ya!

July 24, 2008 Posted by Vishal Kotcherlakota | Uncategorized | | 2 Comments

Experiment Setup

Introduction

We aim to test a localization network using a pseudo-testbed. While it cannot localize in real-time, it will provide more meaningful data about noise and the impact it can have on localization, and will give us a better idea of performance.

The following flowcharts explain how the system works:


The orange dots represent omni-directional microphones. I want a setup with 16 microphones arranged in 4 clusters, with 4 microphones per cluster. Each microphone will have its own channel of audio on a sound card, like the one below:


I will set up the sound card to output a .wav file for every cluster. The .wav file will have 4 channels, which will correspond to the individual members of the cluster. The rest of the process is shown in the following flowchart:


We open the .wav files in MATLAB, and operate on them using a simulator, which creates a source estimate. We can implement any simulator with the exact same data (.wav files), meaning we can run as many simulations as we want.
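To make the "one 4-channel file per cluster" idea concrete, here is a minimal sketch of splitting such a file back into one list of samples per microphone. It is written in Python with only the standard library, not the MATLAB used in the project. It assumes 16-bit PCM samples, and the function name `read_cluster_wav` is made up for illustration.

```python
import struct
import wave


def read_cluster_wav(path):
    """Return (sample_rate, channels), where channels[i] holds the samples
    recorded by microphone i of the cluster."""
    with wave.open(path, "rb") as wav:
        n_channels = wav.getnchannels()
        if wav.getsampwidth() != 2:
            raise ValueError("this sketch only handles 16-bit PCM")
        sample_rate = wav.getframerate()
        raw = wav.readframes(wav.getnframes())

    # Samples are interleaved in the file: ch0, ch1, ch2, ch3, ch0, ch1, ...
    samples = struct.unpack("<%dh" % (len(raw) // 2), raw)
    channels = [list(samples[i::n_channels]) for i in range(n_channels)]
    return sample_rate, channels
```

Because the recordings sit on disk, the same call can feed any number of different simulators with identical input.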

Technical Layout

I will be recording in the Smart Spaces Lab on the 5th floor of Calit2. The space available for testing is 90″ wide, 173″ deep, and 120″ in height. It’s a “glass house”, meaning that there are no actual walls, just tape marks on the floor. This will help us, as we will not have to worry as much about reverberation.

I will place microphones at the midpoints of the rectangle bounding the room for my first configuration. The overhead view should look like the following:


Each of the blocks represents a speaker stand outfitted to carry 4 microphones like so:


I will make at least 5 separate recordings with varied source locations to test the method with, then try other localization schemes.
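For the first configuration (stands at the midpoints of the rectangle bounding the room), the floor positions follow directly from the 90″ by 173″ footprint. A small sketch in Python, assuming the origin sits at one corner of the taped rectangle with x along the width and y along the depth; the axis choice and the name `stand_positions` are assumptions for illustration.

```python
ROOM_WIDTH_IN = 90.0   # x direction, inches
ROOM_DEPTH_IN = 173.0  # y direction, inches


def stand_positions(width, depth):
    """Midpoints of the four sides of a width-by-depth rectangle that has
    one corner at the origin."""
    return [
        (width / 2, 0.0),
        (0.0, depth / 2),
        (width / 2, depth),
        (width, depth / 2),
    ]


for x, y in stand_positions(ROOM_WIDTH_IN, ROOM_DEPTH_IN):
    print(f"stand at x = {x:5.1f} in, y = {y:5.1f} in")
```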

Conclusion

This method isn’t as exciting as a real-time test bed, but analytically it’s a better setup, because it allows me to reprocess the same data in the same conditions multiple times, ensuring consistency.

July 23, 2008 Posted by Vishal Kotcherlakota | Uncategorized | | No Comments Yet

Project Reflection

I figure I should document my achievements up to this point.

What Works

  • TDOA Simulator. I have built a simulator which will define a room with 4 microphone clusters. The simulator will then randomly pick a source location, and then calculate the time delays the microphone pairs would pick up. Then, each cluster generates a vector back towards the source. The measurements the cluster uses can be injected with random Gaussian noise if so desired.
  • PCI Functions. I implemented Brandstein et al.’s method for finding points of closest intersection. The function creates the overdetermined system I described in my notes, and then uses the least-squares method to solve for the desired parameters. I have called this “open-form” because, well, it’s not a closed-form solution. That is to say, this method is not a general solution I can plug values into and get an answer. Having a closed-form equation is desirable for real-time situations, because it’s easier to process. UPDATE: I now have the closed-form equation. It is a little too much to go into here, but I will post notes on it soon.
  • Coordinate Transformation. I have an algorithm that can translate vectors from the microphone’s coordinate system to the room’s coordinate system. However, this algorithm only works for rooms that are perfectly rectangular, that is, rooms with only four walls that meet at 90-degree angles. Though this is the kind of room that a Cartesian coordinate system implies, I’ve yet to see a room that is perfectly rectangular. Most rooms have some sort of taper, projection into the room, alcove, etc., that prevents them from being perfectly rectangular.
  • Centralized Method. I have built an add-on to the TDOA simulator that emulates the setup used by Brandstein et al. This simulator emulates the ML estimator, but it limits the field of search to the points of closest intersection (PCI) of the lines. This simulator is fairly robust with “noise” of +/- 0.01 seconds, which is impressive when the time delay is on the order of 0.0001 seconds.
  • Orthogonal Projection Method. I have an algorithm that can find the intersection of lines using the orthogonal projection method I showed in my previous notes. This method can function both with and without noise. To force a convergence to a single point, the algorithm will apply a “relaxation term” which will limit the contribution individual members of the ring make to the final estimate.
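As a rough illustration of what the TDOA simulator in the first item does, here is a minimal Python sketch (not the MATLAB simulator itself). It assumes a speed of sound of 343 m/s, a hypothetical room size, and a cross-shaped cluster built by a made-up helper `cross_cluster`. It only covers picking a random source and computing noisy pair delays, not the vector generation.

```python
import math
import random

SPEED_OF_SOUND = 343.0  # m/s, assumed (roughly room temperature)


def cross_cluster(center, half_spacing):
    """Four microphones in a cross around `center`, in a vertical x-z plane."""
    cx, cy, cz = center
    return [
        (cx - half_spacing, cy, cz),
        (cx + half_spacing, cy, cz),
        (cx, cy, cz - half_spacing),
        (cx, cy, cz + half_spacing),
    ]


def pair_delay(source, mic_a, mic_b, noise_std=0.0, rng=random):
    """Time difference of arrival (seconds) between mic_a and mic_b.

    Positive means the sound reaches mic_b first. Zero-mean Gaussian noise
    with standard deviation `noise_std` is added when noise_std > 0.
    """
    delay = (math.dist(source, mic_a) - math.dist(source, mic_b)) / SPEED_OF_SOUND
    if noise_std > 0.0:
        delay += rng.gauss(0.0, noise_std)
    return delay


rng = random.Random(0)
room = (3.0, 4.0, 3.0)  # hypothetical room size in metres
source = tuple(rng.uniform(0.0, side) for side in room)
mics = cross_cluster(center=(1.5, 0.0, 1.5), half_spacing=0.25)

print("source:", source)
print("horizontal pair delay:", pair_delay(source, mics[0], mics[1]))
print("vertical pair delay:  ", pair_delay(source, mics[2], mics[3]))
```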

What Needs Work

  • New Microphone Geometries. To simplify the process of calculating vectors, Brandstein et al. fixed the microphones to a cross-shaped geometry. While this does seem to work fairly well, it’s hardly enough for me to rehash the contents of an old paper with a new estimator. What’s to stop us from having 8 microphones arranged in a circle? To that point, what about a ring of microphones? What’s the formula for getting the vector when you have N microphones arranged in a certain pattern (ring, sphere, cube, etc.)? This will be tough, but this is what will make my research mean something.
  • Developing the Wireless Network. I will need to conceive the network’s workings. I have a simulator that lays out what the distributive network does, but not how the network goes about accomplishing it. Things I will need to consider include efficiency of the network, “race conditions” (where the output changes depending on which input gets there first), how to handle things such as ring convergence, and how to report the final estimate. The trick is to make maximum use of the hardware while keeping the network under control. The collaborative network has been mentioned in multiple papers, and Paolo has implemented it before. The only issue is that in papers, people are more interested in theory than implementation, so it’s hard to avoid reinventing the wheel.
  • Hardware/Testbed. I need to find a space where I can conduct experiments, and get hardware that will do the job. There is a smart space available on the 5th floor of Calit2 where these experiments can be conducted. The issue at hand is figuring out how to get the information from 4+ microphones to a computer for analysis. Though up until now Paolo has been taking the lead, I will need to take the initiative on this. I have a list of contacts ready, and will need to figure out the most cost-efficient way to build the needed hardware (if it needs to be built at all). If I’m lucky, I’ll be able to borrow existing hardware and avoid the issue altogether.
  • Abstract. I need to develop my abstract and title for the Undergraduate Research Conference in August. This will take some conversing with Paolo, and will (hopefully) not take too long.

     
     

July 16, 2008 Posted by Vishal Kotcherlakota | Uncategorized | | No Comments Yet

Orthogonal Projection vs PCI

I have completed simulators which emulate the performance of a centralized localization scheme (where all microphone clusters communicate with a single computer) and that of a distributive localization scheme (where microphones collaborate).

I also created a statistics test bench which shows the performance of the algorithm.


The red line shows the centralized scheme of localization, and the blue one shows the distributive scheme of localization. Bias is the distance between the source location and the scheme’s estimate. Because my distributive scheme is iterative, I also kept track of the number of estimates it took for the method to converge. Each point of data on this graph is the average of 100 trials.

The x-axis gives the maximum error in the time-delay measurement.

The graphs tell us 3 things. First, we know the limit of the noise that either system can tolerate: both systems fall apart at around the same level of noise. Second, since the curves are for the most part on top of each other, the two methods are comparable (very good news). Third, it takes between 18 and 20 estimates to reach a solution. This means that, for a 4-sensor network, each sensor makes 4 or 5 calculations, making this method very fast. There is also no need to fiddle with a random variable, unlike with the centralized method.

Here’s where things get interesting. My question now is this: What happens when we try a new microphone geometry? We’ve been working so far with this:


What happens when we try something like this?


With 8 microphones in a cluster, we can no longer use the notion of cones to come up with the intersection. Now we have at least 4 cones to contend with. With noise, we’d need an estimator just to get the vector.

But what if we can find another way to do it? Thinking back on what the time delay lets us do:


We have an arc of all possible directions, which ranges from 0 to 180 degrees. The black vector is the vector given from the time delay. The green and orange vectors compose the black vector. If we can come up with a scheme that properly translates these vectors, we’ll have no need for a cone.
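One way to read this, assuming a far-field source and a speed of sound of 343 m/s: a single time delay between two microphones a known distance apart fixes the angle between the microphone axis and the source through cos(angle) = c * delay / spacing, and the resulting direction splits into a part along the axis and a part perpendicular to it. The Python sketch below shows just that relation. The function names are made up, and treating the two composing vectors as along-axis and perpendicular parts is an assumption, not something the figure confirms.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed


def bearing_from_delay(delay, spacing):
    """Angle (degrees, 0..180) between the microphone axis and the source,
    using the far-field relation cos(angle) = c * delay / spacing."""
    cos_angle = SPEED_OF_SOUND * delay / spacing
    cos_angle = max(-1.0, min(1.0, cos_angle))  # guard against noisy delays
    return math.degrees(math.acos(cos_angle))


def components(angle_deg):
    """Unit direction split into a part along the microphone axis and a part
    perpendicular to it."""
    angle = math.radians(angle_deg)
    return math.cos(angle), math.sin(angle)


angle = bearing_from_delay(delay=0.0005, spacing=0.5)
print("angle:", angle, "components:", components(angle))
```

A delay of zero puts the source broadside (90 degrees); the largest possible delay, spacing / c, puts it on the axis (0 or 180 degrees).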

And so begins the next task.

 
 

July 14, 2008 Posted by Vishal Kotcherlakota | Uncategorized | | No Comments Yet

Orthogonal Projection – Resolved


I’ve come up with a resolution to the ring convergence problem. Will post the formula later.


MATLAB report:

13-Jul-2008 17:17:33

———————

Source Location

[0.059564,1.0961,8.1876]

Ring convergence found

Applying relaxation term.

Estimate=[0.055027,1.113,8.1881]

Statistics:

Injected Gaussian Noise with mean=0 and variance=0.001

Estimates Needed=45

Bias=0.00030764

July 14, 2008 Posted by Vishal Kotcherlakota | Uncategorized | | No Comments Yet

Orthogonal Projection Method

Today, we’ll see how to find the intersection of 2 lines using orthogonal vectors. This is key to understanding the collaborative method of localization.

Imagine we have 2 lines that intersect.


These lines are assumed to extend to infinity. All we know about these lines is their starting point and what direction they go in. In our example, the starting points are x1 and x2, and the directions are given by v1 and v2, respectively. First, we find the vector that goes from x1 to x2. We will call this vector vd.


Next, we project this vector vd onto the line starting at x1. The projection becomes the new x1.


We notice that the new x1 is closer to our point of intersection than the old x1. The points x1 and x2 should be seen as estimates of the point of intersection. As we keep repeating this process, our estimates will get better and better:


Now, x2 is almost at the point of intersection. With another 2 rounds through this cycle, we should be able to estimate the intersection down to +/- 0.001 units. When the length of vd (the distance between x1 and x2) becomes small enough, we stop the process.
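The loop described above can be sketched in a few lines of Python (a stand-alone illustration, not the MATLAB testbench). Projecting vd onto line 1 and adding it to x1 is the same as dropping x2 perpendicularly onto line 1, which is what `project_onto_line` does. The function names, the default tolerance, and the example lines are assumptions.

```python
import math


def project_onto_line(point, start, direction):
    """Orthogonal projection of `point` onto the line start + t * direction."""
    t = sum((p - s) * d for p, s, d in zip(point, start, direction))
    t /= sum(d * d for d in direction)
    return tuple(s + t * d for s, d in zip(start, direction))


def intersect_by_projection(x1, v1, x2, v2, tol=1e-3, max_rounds=1000):
    """Alternately project each estimate onto the other line until the
    estimates x1 and x2 are closer together than `tol`."""
    start1, start2 = x1, x2
    for rounds in range(1, max_rounds + 1):
        x1 = project_onto_line(x2, start1, v1)  # new x1: x2 dropped onto line 1
        x2 = project_onto_line(x1, start2, v2)  # new x2: x1 dropped onto line 2
        if math.dist(x1, x2) < tol:
            return x2, rounds
    raise RuntimeError("no convergence; the lines may be parallel or skew")


# Two hypothetical lines that cross at (1, 2, 3).
estimate, rounds = intersect_by_projection(
    (-1.0, 2.0, 3.0), (1.0, 0.0, 0.0),
    (4.0, 5.0, 3.0), (1.0, 1.0, 0.0),
)
print("estimate:", estimate, "after", rounds, "rounds")
```

How fast the estimates close in depends on the angle between the lines: the closer the lines are to perpendicular, the fewer rounds are needed.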

We can use this for lines that intersect in 3D space too. Here’s a picture.


A closeup shows the two estimates getting more and more refined.


These pictures come from a testbench I set up in MATLAB. Here’s the text report.

Initial

3.6228 2.1775 1.9881

Estimate

3.6228 2.1775 1.9881

Bias=1.627e-030

Bias is the difference between the actual intersection and my estimation.

As you can see, it’s pretty accurate, and it will work for any number of lines, too:


This is a simple way to collaboratively find the intersection of two lines. All a cluster needs is a point on a neighboring line, and it can then project it onto its line. It can then pass its projected point on to the next line. The only problem is that when the lines don’t intersect, you will end up with something called ring convergence. In the picture above, the lines intersect perfectly, so the method will converge to a single point. However, with ring convergence, the method will converge to a ring of points:


As you can see, there is a ring convergence with 3 points.
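Ring convergence is easy to reproduce. The Python sketch below passes one estimate around three lines that never meet (the line values are hypothetical) and prints where it settles on each line. It only demonstrates the problem and does not include any fix for it.

```python
import math


def project_onto_line(point, start, direction):
    """Orthogonal projection of `point` onto the line start + t * direction."""
    t = sum((p - s) * d for p, s, d in zip(point, start, direction))
    t /= sum(d * d for d in direction)
    return tuple(s + t * d for s, d in zip(start, direction))


# Three lines (start point, direction) that never meet: hypothetical values.
lines = [
    ((0.0, 0.0, 0.0), (1.0, 0.0, 0.0)),
    ((0.0, 0.0, 1.0), (1.0, 1.0, 0.0)),
    ((0.0, 2.0, 0.0), (1.0, 1.0, 1.0)),
]


def pass_around_ring(lines, cycles=60):
    """Pass one estimate from line to line, projecting it each time, and
    return the last point left on every line."""
    point = lines[0][0]
    ring = [point] * len(lines)
    for _ in range(cycles):
        for i, (start, direction) in enumerate(lines):
            point = project_onto_line(point, start, direction)
            ring[i] = point
    return ring


ring = pass_around_ring(lines)
for i, p in enumerate(ring, start=1):
    print(f"line {i}: ({p[0]:.4f}, {p[1]:.4f}, {p[2]:.4f})")
```

Running more cycles does not help: the three points stop moving but stay apart, so the length of vd never drops below the stopping threshold.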

This is the challenging part of the process, and I’m working on applying techniques to deal with this issue. Once I can get my simulator to account for this and converge to a single point, I’ll make a new post.

 
 

July 13, 2008 Posted by Vishal Kotcherlakota | Uncategorized | | 1 Comment

LI Implementation: Complete!

I have finished implementing the LI estimator. I added noise in the form of additive zero-mean Gaussian noise with an adjustable variance. Here are some reports from MATLAB:

 
 

Report 1

Injected Gaussian Noise with mean=0 and variance=0.001

Final Estimate=[5.5941,3.7294,4.1864]

Source Location=[5.5909,3.728,4.1865]

Bias=1.2495e-005


Report 2

Injected Gaussian Noise with mean=0 and variance=0.01

Final Estimate=[5.1913,4.0379,2.1349]

Source Location=[5.0773,4.2012,2.0265]

Bias=0.051423
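The Bias figures in these two reports can be sanity-checked from the printed coordinates. Using the rounded values above, the squared Euclidean distance between estimate and source lands close to the printed Bias in both reports, while the plain distance does not, so the reports presumably print the squared distance. That is an inference from the rounded numbers only. A small Python check:

```python
def squared_distance(a, b):
    """Sum of squared coordinate differences between points a and b."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


reports = [
    # (final estimate, source location, Bias printed in the report)
    ((5.5941, 3.7294, 4.1864), (5.5909, 3.728, 4.1865), 1.2495e-05),
    ((5.1913, 4.0379, 2.1349), (5.0773, 4.2012, 2.0265), 0.051423),
]

for estimate, source, printed in reports:
    recomputed = squared_distance(estimate, source)
    print(f"recomputed {recomputed:.4e}, printed {printed:.4e}")
```

The small mismatch that remains is what you would expect from coordinates printed to only a few significant digits.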

 
 


A closer look shows all points of closest intersection, the actual location of the source, and our final estimate.
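For a pair of bearing lines that don’t quite meet, the points of closest intersection are the point on each line nearest the other line. Here is a minimal Python sketch of that geometric step only. It is not the full LI estimator and says nothing about how the points are combined into the final estimate; the function names and example lines are made up for illustration.

```python
def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def closest_points(p1, d1, p2, d2):
    """Return (q1, q2): the point on line 1 nearest line 2, and the point on
    line 2 nearest line 1. Each line is given as start point + direction."""
    w = tuple(a - b for a, b in zip(p1, p2))
    a, b, c = dot(d1, d1), dot(d1, d2), dot(d2, d2)
    d, e = dot(d1, w), dot(d2, w)
    denom = a * c - b * b
    if abs(denom) < 1e-12:
        raise ValueError("lines are parallel; no unique closest points")
    s = (b * e - c * d) / denom  # parameter along line 1
    t = (a * e - b * d) / denom  # parameter along line 2
    q1 = tuple(p + s * x for p, x in zip(p1, d1))
    q2 = tuple(p + t * x for p, x in zip(p2, d2))
    return q1, q2


q1, q2 = closest_points((0.0, 0.0, 0.0), (1.0, 0.0, 0.0),
                        (2.0, -3.0, 1.0), (0.0, 1.0, 0.0))
print(q1, q2)
```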


This marks the end of my work exploring centralized methods. I now turn my attention to distributive sensor networks. Stay tuned!

July 3, 2008 Posted by Vishal Kotcherlakota | Uncategorized | | No Comments Yet

Coordinate Change

Introduction

To simplify things in the LI estimator, we use a relative coordinate system like the one below.


However, we localize the source in a coordinate system relative to the room, which looks like this:


We need to have an algorithm that lets us change the coordinate system appropriately.

Coordinate Types

We can make the following assumptions about our microphone arrays:

  • Microphones will never be at the origin of our relative system
  • Microphones in a cluster will all lie in the same plane. This plane will be parallel to the xz plane or yz plane.
  • The origin of a cluster will never be on an axis of a room.

With these assumptions, we can say that for a room with a width W and length L, the coordinates of the cluster will come in four types.

Type    Coordinates
I       (x, 0, z)
II      (0, y, z)
III     (x, L, z)
IV      (W, y, z)
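The table maps directly onto a small lookup. This Python sketch (with a made-up function name and tolerance) returns the type for a cluster origin, assuming x runs along the width W and y along the length L:

```python
def cluster_type(origin, width, length, tol=1e-9):
    """Return 'I', 'II', 'III' or 'IV' for a cluster origin (x, y, z) on the
    boundary of a width-by-length room, following the table above."""
    x, y, _z = origin
    if abs(y) < tol:
        return "I"
    if abs(x) < tol:
        return "II"
    if abs(y - length) < tol:
        return "III"
    if abs(x - width) < tol:
        return "IV"
    raise ValueError("cluster origin is not on a wall of the room")


print(cluster_type((45.0, 0.0, 60.0), width=90.0, length=173.0))
```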

Here’s an illustration of all 4 types:


This is important, because it guides how we do our transforms.

Step 1: Rotation

It’s possible that the line connecting m1 and m2 is not parallel to the x or y axis. To take this into account, we need to measure the angle the line has relative to the correct axis in the room coordinate system, and then rotate our cluster coordinate system by that amount about its z-axis. The formula for getting the angle is:


Where a and b are determined by the following table:

Type    a    b
I       x    z
II      y    z
III     x    z
IV      y    z

We then use our value of θ to rotate our coordinate axes. This is done by multiplying the vectors by a rotation matrix. This is something that can be done easily in MATLAB, so I won’t go into depth on it here. What I will say is that we have a MATLAB function:

R1 = rotate_rel_abs(M1, [ origin; kappa, theta, phi ]);

The function rotates the coordinate system through the three angles κ, θ, and φ, and translates it to the point given by the vector origin. The result is that M1 gets mapped to a corresponding vector R1 in a new coordinate system.

This first rotation is simple. The code for rotating a point M1 is as follows:

M1_r = rotate_rel_abs(M1, [0 0 0; 0 theta 0]);
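Since rotate_rel_abs itself isn’t listed in this post, here is a stand-alone Python sketch of the underlying operation: rotating a point about the z-axis with a rotation matrix. The helper `rotate_about_z` is hypothetical, the counter-clockwise sign convention is an assumption, and it leaves out the translation and the other two angles that rotate_rel_abs handles.

```python
import math


def rotate_about_z(point, theta):
    """Rotate `point` by `theta` radians about the z-axis (counter-clockwise
    when looking down the z-axis towards the origin)."""
    x, y, z = point
    c, s = math.cos(theta), math.sin(theta)
    # Equivalent to multiplying by [[c, -s, 0], [s, c, 0], [0, 0, 1]].
    return (c * x - s * y, s * x + c * y, z)


print(rotate_about_z((1.0, 0.0, 5.0), math.pi / 2))
```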

Step 2: Translation

We now need to take our point relative to our rotated microphone axes and translate it appropriately. We’ll use the following table to make the translation:

Type κ θ φ Additional Transform
I 0 0 y = -y
II 0 N/A
III 0 0 N/A
IV x = 2L – x

Conclusion

We can summarize the above in the following MATLAB code. Feel free to use it, but remember to cite me.

Matlab functions to perform coordinate transformation

July 3, 2008 Posted by Vishal Kotcherlakota | Uncategorized | | No Comments Yet