Hello Gentlemen. I am new here, so preliminary apologies for any blunders. I would like to write a program to find an unknown pattern in a sequence of several bit streams of varying length. The bit streams are basically lines of binary in a text file that come from a bit of excel magic and an oscilloscope (and before you ask, my scope is cheap and coding this could be fun). So for example, if I was given a file with:
Find a pattern between the streams, align the streams, and take an average (as noise is a factor) discarding any "junk data" (like the stream of 111111 or 0000 at the front or end of the bit stream). It is possible that the program could be provided with a very rough approximation of what the stream looks like initially.
I have a few ideas, however I am by no means an expert programmer. Is there a reasonably efficient way to do this? (streams will be around 500 to 750 bits long)
Thanks in advance for your consideration of my current conundrum.
Pattern recognition is generally accomplished one of two ways: either by brute force checking for certain patterns (ie, you have a set of patterns in mind, and you want to check them one by one) or by artificial intelligence. The latter is orders of magnitude more complicated than the former.
So you need to clarify a bit. Are you just trying to splice a bitstream into N consecutive,
equal-sized subcomponents such that all subcomponents are exactly equal to one another,
"almost" equal to one another, or what?
I think you could try cross correlation which might involve array multiplication. just a guess from my DSP class some one might have a better idea. but i think you can try this
so average won't work.. it looks like if it's 1 only in the top two only it's zero, and if it's 1 only in the bottom two or 1 only in the top and bottom (not the middle), then it's 1, otherwise, if it's 1 only in one then it's 0 and if it's 1 in all three its 1 and 0 in all three 0...
if that is what you are looking for , then without really checking it, you could (if the first is x, second is y, and third is z, and A is the number you are looking for) do this:
A= ( (x&z) | (y&z) )
could do it for you
10101
11100
01011
--------
01001
if you want average (if there are at least two 1's then 1), you can do this
Hi again, this is ark488, not sure why but I can't login to that account...
Anyways, I think I feel the same way as jsmith. I had similar thoughts when I initially considered this problem. It's not something that can be entirely brute forced as the streams are far too long and there is no absolute cohesion between them.
The method I've been using so far requires some information about the bit stream. Knowing the preamble of the data, the streams can be aligned in time, which will align the data, at which point a simple average of the streams can be taken to get a reasonable approximation (and hopefully cancel out some noise in the streams).