From the September 1999 issue of Silicon Chip.
Teach it to recognise YOUR voice!
VOICE DIRECT
Speech Recognition
By Ross Tester
As you are reading this, a spacecraft is speeding towards
Mars. It will make a “soft” landing in just a few weeks.
One of its scientific experiments will be to listen for, and learn,
any sounds made on the planet. That task will be undertaken
by a very similar chip to that used in this project!
September 1999 35
Some projects appear so simple,
yet the underlying technology
is not only state-of-the-art, it’s
difficult to believe.
This is one such project: a voice
recognition module which actually
learns words, in your voice, and then
recognises them when asked to do so.
Think about that for a moment.
The chip is not simply recognising
an inbuilt vocabulary, though that is
no mean feat in itself.
It’s actually recognising words
which YOU teach it. The words don’t
have to be English. In fact, the words
don’t have to be words in the true
sense at all.
They can be sounds. They can even
be complete gibberish, just so long
as the chip can recognise them and
learn them.
That’s one of the reasons it was
chosen for the Mars project. Just imagine if they do find little green men
up there – they’re not likely to have
learnt the Queen’s English, are they?
Seriously, though, if there are
sounds to be heard, the chip will
learn them.
(If you’d like to know more about
the Mars Microphone probe, check
out the website www.sensoryinc.com/
html/mars.htm).
Now, back to this amazing technology and our project.
It’s all done with a purpose-designed module from Sensory, Inc, of
the USA. This module, measuring just
50 x 50mm and containing a couple of
ICs and a few surface-mount devices,
is capable of learning, and then recognising, virtually any word – with
a few provisos such as the words not
being too similar.
Front (left) and rear (right) of the Sensory Inc. “Voice Direct” module, shown actual size.
For instance, it shouldn’t have a
great deal of trouble with cat and cart
but cat and mat might give it some
angst. But more on this later.
All that is required is the connection of power (5V DC), an electret
microphone, a speaker and three
switches and the module is ready for
operation.
In this mode, shown in Fig.1, it
will ask you to say a word and then
repeat it. If you say the word the same
way twice, it will then ask you for
the next word, and so on, up to 15
words in total.
If you only want, say, three words,
you simply do not respond when it
asks for the fourth word and the learning mode is then terminated.
When you push the “recognise”
switch, it asks you to say a word. If
your word is in its vocabulary, it responds by sending one of its outputs
high. What you do with that output
is entirely up to you.
Fig.1: Sensory Inc’s suggested circuit which will recognise three words and
flash a LED. We used this as a starting point for our experimenter's board.
Just think of the applications: do
you want the TV set on, or the channel
changed? You can TELL the module
which channel you want!
Or you walk into a lift and instead
of pressing a button, you tell the lift
which floor you want. The lift then
whisks you to your floor! The possibilities of such a system are endless!
How about a robot which comes out
to serve you in the restaurant, recognising your order by what you say?
Gee, you could even send a system
into space and land on another planet
– Mars, for example. It could listen for
sounds, learn them, recognise them –
and maybe turn on a transmitter and
tell us back on earth. . . Oh, someone
has thought of that one already?
Yes, it really is out of this world!
The experimenter’s board
As you can see from the photographs and main circuit, we have developed an experimenter’s PC board
to go with the speech module. Let’s
explain why.
The user manual which accompanies the module suggests a circuit
which lights LEDs when it recognises
a word.
However, they only suggested three
LEDs. But we thought “there are 15
words, so why not 15 outputs?”
It isn’t quite that simple, because
only eight of the word outputs have
their own output pins. The last seven
require decoding, using combinations
of the other outputs. So we added a
couple of low-cost AND gates to provide all 15 word outputs.
The manual also suggested a 4.5V
power supply using three 1.5V batteries. But it also says that low battery
levels will degrade performance. For
the sake of a cheap 5V regulator and a
couple of capacitors, we sidestepped
that problem. You can use a 12V plug-
pack supply without worries.
Now that we had 12V available,
we thought it would be a good idea
to provide a relay output and driver,
just in case you had something you
wanted to control. Naturally, this (just
like the extra decoding) is entirely
optional – if you want to save a little
money, you can leave them off.
Some parts which aren’t optional, though, are the microphone, the
speaker and the three pushbutton
switches. These are, as you might
expect, essential for setting the mode,
teaching the module and recognising
the words.
Because we had now decided to put
all this on a printed circuit board, we
decided to mount the speaker and mic
on-board and use PC board-mounting
switches as well, making the whole
thing self-contained.
Now, how to mount the module
itself?
Sensory suggest using standard
0.1in header pins to make connection,
as all connections to the module are
brought out to holes on a standard
0.1in grid.
We took advantage of this to mount
the board by duplicating the rows of
holes, allowing the module to simply
drop over rows of header pins mounted on our PC board.
From there, it was a simple matter
to solder the pins to their appropriate
connections – it’s impossible to get
the connections wrong.
(But you can bridge between adjacent pins so you have to be very
careful and use a clean, fine-tipped
soldering iron).
As you can see, we also added two
extra rows of header pins to enable
connection between the word 1-8
outputs and the decoding circuitry.
As output 8 is always part of the
decoding, this was wired directly on
the PC board.
One other pair of header pins was
installed on two pads which are actually shorted out. This might seem
a little strange but these pins can be
used to select a “slave” mode of operation if the copper between them
is cut. Again, you may care to leave
these pins out.
All of the electronics are on a PC
board measuring 160 x 118mm. Power
connection to the board is made via
a 2.1mm DC power socket (so it will
suit most plug packs). Polarity is
the “standard” centre positive but
because there really is no standard, a
series diode will prevent catastrophes
if power is connected with the wrong
polarity.
A resistor and LED between +12V
and 0V will show that power is on and
also connected the right way around!
Above: the SILICON CHIP experimenter’s board ready for the Voice Direct module.
The lower row of LEDs lights when words 1-8 are recognised. With suitable
connections the upper row of LEDs lights with words 9-15.
Below: with the Voice Direct module fitted.
Construction
Start with the resistors. Once these
are soldered in place (particularly the
ones in series with LEDs), the chances
of making an error in placing other
components are significantly reduced.
Next, solder the 9 PC pins in place.
The two power diodes, small (bypass)
capacitors and the two wire links
complete most of the low profile
components.
Now move on to the semiconductors: all of the LEDs except the
“power on” LED are oriented with
their cathodes (marked by a flat on
the body and a shorter lead) towards
the edge of the board. How far down
you mount the LEDs is up to you – we
left about 5mm between the top of the
board and the LED body.
You could solder them in anywhere
from flat down on the board to standing full length upright (the latter has
the advantage of being able to be used
again in another project).
The 5V regulator is mounted with
its flat side towards the edge of the
board and both electrolytic capacitors
have their negative sides towards the
edge also.
The two ICs have their notched end
(or the dot marking pin 1) towards the
middle of the board.
When mounting the switches,
ensure that their flat sides also go
towards the edge of the board, like
the LEDs. The relay cannot mount
incorrectly – it has eight pins and
they only fit one way.
Likewise, the DC power socket
must be correct with its three pins in
a triangular pattern and the opening
towards the outside of the board.
About all that is left are the microphone and speaker, along with the
header pins which mount the module.
First, the electret microphone insert. Take a look at the two pins on
the rear: one of the two is connected
to the case and this pin goes into the
hole closest to the regulator.
The speaker is mounted by two
short lengths of tinned copper wire
connecting the speaker terminals to
the appropriate PC pins. We made
our speaker more secure by putting
a dab of super glue gel on the back
of the speaker, securely holding it to
the PC board (double-sided adhesive
foam tabs would be just as effective).
Finally, we come to the header pins
which are used to mount the Voice
Direct module. We used two 25-pin
headers which give 50 pins – exactly the number required to fill all the
holes on the board.
In truth, this is a bit of overkill
because in this application only 22
of the pins are actually needed to
connect to the board.
However, we filled all 50 holes on
the base PC board with header pins
and soldered only those required – 19
along one edge and three adjacent.
Because the header pins come in
25-pin strips, you will have to cut
them (a pair of sharp sidecutters is
fine). Cut a 19-pin length from one
strip and an 18-pin length from the
other – these are for the parallel rows.
You will have a 6-pin length left over from one strip
and a 7-pin length from the other; by sheer
fluke there are 13 pins required to
complete the set!
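The cutting plan above can be checked with a few lines of arithmetic. This is only a sketch: the names "strip A" and "strip B" are labels made up here for the two 25-pin strips.

```python
# Header strip cutting plan for the 50 module-mounting holes.
STRIP_LENGTH = 25  # pins per strip, as supplied

# Lengths cut from the two strips for the parallel rows (figures from the text).
# "strip A" and "strip B" are just illustrative labels.
cuts = {"strip A": 19, "strip B": 18}

# Whatever remains of each strip after its cut.
offcuts = {name: STRIP_LENGTH - length for name, length in cuts.items()}

pins_in_parallel_rows = sum(cuts.values())   # 19 + 18
pins_in_offcuts = sum(offcuts.values())      # 6 + 7
total_pins = pins_in_parallel_rows + pins_in_offcuts

print(offcuts)      # {'strip A': 6, 'strip B': 7}
print(total_pins)   # 50
```

The two offcuts together supply the 13 remaining pins, so nothing is wasted from the two strips.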
Push the short length of the header
pin block through the PC board and
very carefully solder the required
pins underneath using a fine-tipped
iron with a clean tip. Regularly clean
the tip on a wet sponge as you work
and don’t overheat the solder joint
as the pads are very small and could
easily lift.
Check, and check again, that you
haven’t bridged solder between any
two pins.
We didn’t bother soldering all 50
pins in place on the board, only the
required pins (ie, all 19 in one strip
and the three adjacent) and also the
end pins of each of the other header
pin sets, just to hold them in place.
An 8-pin length and a 7-pin length
of header pins are also needed to connect the chip outputs and decoding
lines.
We also used a 2-pin set on the
pads connected to the “stand alone”
pins but this is not necessary in this
application (we happened to have a
2-pin set spare, so why not?)
Fig.2: our final experimenter’s board circuit. You can see the similarities between this circuit and the basic circuit
overleaf. We have added a power supply, outputs for all words and also some decoding for words 9 through 15.
Fig.3: here's how it all goes together on the printed circuit board. The bottom row of LEDs represents words
one to eight while the top row will decode nine to fifteen. The word output pins 1-7 need to be connected to
the appropriate word 9-15 input pins for decoding – see the separate logic table.
Fig.3 labels: header pin set JP1 – no connection; header pin set JP2 – all connected except 8 & 9; header pin set JP3 – pins 12-14 connected.
You could use a single header pin
(instead of a PC stake) in the base
circuit of the relay driver if you wish.
Before mounting the Voice Direct
module on the PC board, you might
like to confirm that everything is
working correctly.
Plug in a 12V supply and ensure
that the power LED lights. Measure
the voltage between pins 4 and 5 of
header pin set JP2 of the voice module
(counting from the end closest to the
supply) and confirm it is 5V (pin 4
+5V, pin 5 0V).
Temporarily solder a length of insulated hookup wire to a point on the
+5V rail under the board and touch
the other end of this lead on each of
the header pins in the 8-pin set (word
1-8 outputs). Each of the LEDs should
light in turn. Do the same with the
7-pin row (word 9-15 inputs) and
each of those LEDs should also light.
Finally, touch the lead on the PC
stake (or header pin) connected to
the base circuit of the relay driver
and you should hear the relay click
in. If all is OK, disconnect power and
unsolder the wire. If all is not OK,
you’ll need to track down the fault
before proceeding.
There are four holes on the board
(above the switches) which at this
stage we’ll leave unfilled. They’re for
a more difficult learning mode, which
we’ll cover shortly.
Mounting the module
With the header pins in place, it
is simply a matter of dropping the
Voice Direct module over the pins
and sliding it down.
Because the holes on the module
are all plated through, it is quite possible that the module won’t even need
to be soldered in place (especially
useful if you wish to use the module
elsewhere).
But . . . Murphy’s law being what it
is, there is always a chance that one
or more pins won’t make contact so
we took the safe way out and soldered
the 22 required header pins to the
module. Again, a very fine, clean iron
is essential.
The project is now finished: now’s
the time to teach it some words!
Voice “training”
Apply power to the board and press
the “train” button. You should hear a
voice say “Say word 1”. Speak your
chosen word clearly into the microphone. The voice will say “repeat”.
Make sure you speak the same way
– that is, don’t change inflections or
emphasis because the module may
think you are saying a different word.
If it understands the word, it will
say “accepted” and ask you to “say
word 2”. You keep on repeating the
process until all 15 words are trained
or you have trained the number of
words required. If you want to train
six words, for example, simply do
nothing when it asks you for word
seven and it will respond with the
words “training complete”.
Parts List
1 PC board, code 07109991, 160 x 118mm
1 Sensory Inc. “Voice Direct” speech recognition module
1 2.1mm PCB mounting DC power socket
3 25-pin 0.1in header pin sets
3 momentary action pushbutton switches, PC board mounting
1 PC board mounting 12V relay, DPDT contacts
1 electret microphone insert
1 8Ω speaker, 57mm
Semiconductors
16 5mm LEDs (any colour)
2 4081 quad AND gate ICs
1 BC337 NPN transistor
1 7805 voltage regulator
2 1N4004 power diodes
Capacitors
1 1000µF 25VW PC electrolytic
1 10µF 12VW PC electrolytic
2 0.1µF MKT polyester, monolithic or ceramic
Resistors (0.25W, 5%)
2 10kΩ
1 1kΩ
15 560Ω
Miscellaneous
9 PC stakes, hookup wire for links, header cables for connecting module outputs to decoder inputs
If a chosen word is too close to
another word, it will tell you. You
should try to avoid similar sounding
words. There is a way to ensure stricter training and recognition which we
will cover shortly.
If you have any difficulty getting the
module to recognise words, a few tips:
• Keep the same distance from the
microphone with the same voice
level when training and saying
words.
• Use a natural voice – while the
module will remember accents and
strange voices, you might not!
• The physical and emotional state
of the voice matters. For example,
if you’ve just run up a flight of steps
and are out of breath, your voice
will sound different than when you
are relaxed.
• In either training or recognition,
background noise may be a problem
– the module doesn’t know that the
background noise is not part of the
word! So take this into account.
Voice recognition
Press the “recognise” button and
you will be asked to say a word.
Say the word clearly. If the module
recognises the word it will respond
with the appropriate word number
and light the LED corresponding to
that number.
If it cannot recognise the word
because of incorrect pronunciation,
inflection or accent, it will say “word
not recognised”. If you say the word
too softly it will ask you to speak up.
If you say the word too quickly it will
tell you so!
As described, the module is set up
for “relaxed” training – it will recognise more words but may not be able
to differentiate between some words.
Another mode, called “strict” training
and recognition, is also available.
In this mode, where the train and
recognise lines are each permanently
pulled to ground via a 100kΩ resistor, the module is harder to train,
it accepts fewer words but has better
accuracy in recognising words. Provision has been made on the PC board
for these 100kΩ resistors – they’re
the empty holes we referred to earlier above the train and recognise
switches.
For most purposes, the relaxed
mode is easier to use – in this case,
simply leave the 100kΩ resistors out.
Decoding
As mentioned before, words 1-8
cause a single output to go high
for about a second, lighting the appropriate LED. Words 9-15 need to
be decoded because they send two
outputs high – output 8 and another
of the 1-7 outputs, depending on the
word. Table 1 shows the output table.
In order to differentiate between
words 1-8 and 9-15, decoding is required. We have provided two 4081
quad AND gates on the board with
the module output 8 permanently
connected to one input from each gate.
Decoding, then, is simply a matter of
connecting a wire between the appropriate 1-7 output header pin and the
required decoder header pin.
This will cause a single LED to light
for outputs 9-15. Logically, word 9
would be the LED closest to the PC
board corner and word 15 would be
the LED closest to the speaker.
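The decoding rule described above can be sketched in software, which may help if you intend to read the module outputs with a microcontroller or PC rather than the AND gates. This is a minimal illustration based only on the output logic in Table 1; the function names `decode_word` and `decoder_leds` are made up here and are not part of any Sensory software.

```python
def decode_word(outputs_high):
    """Return the word number (1-15) for the set of module outputs that are high.

    outputs_high: iterable of output numbers (1-8) currently at logic high.
    Returns None if the combination does not appear in Table 1.
    """
    high = set(outputs_high)
    if len(high) == 1:
        # Words 1-8: a single output goes high.
        (only,) = high
        return only if 1 <= only <= 8 else None
    if len(high) == 2 and 8 in high:
        # Words 9-15: output 8 plus one of outputs 1-7.
        (other,) = high - {8}
        if 1 <= other <= 7:
            return 8 + other
    return None


def decoder_leds(outputs_high):
    """Model the AND gates: which word 9-15 LEDs light for these outputs.

    Assumes every output 1-7 is linked to its matching decoder input.
    Each gate ANDs output 8 with one of outputs 1-7.
    """
    high = set(outputs_high)
    return {8 + n for n in range(1, 8) if 8 in high and n in high}


print(decode_word({3}))        # 3
print(decode_word({8, 1}))     # 9
print(decode_word({8, 7}))     # 15
print(decoder_leds({8, 4}))    # {12}
```

Note that the model treats any combination not listed in Table 1 (for example outputs 1 and 2 together) as invalid and returns None.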
Relay output
A relay circuit is also provided to
give a “real world” output, with two
sets of changeover contacts.
Usage is simple: connect the relay
driver input to any of the required
word outputs and when that word
is recognised, as well as its LED
lighting, the relay will pull in for the
same time.
This is about a second which
should be long enough to initiate some
further action. If not long enough, a
simple time delay can be added.
Note that we have not provided
any latching circuitry on the PC
board – if you want this, it can easily
be achieved by using one of the sets
of relay contacts to hold the relay on
once it is triggered.
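The difference between the momentary and latched behaviour can be illustrated with a small step-by-step model. This is a sketch of the idea only, not a circuit design: the function name `simulate_relay` and the time-step representation are assumptions made for illustration, and the model has no way of releasing the relay once latched (in practice something would have to break the hold-on path).

```python
def simulate_relay(pulses, latching=False):
    """Return the relay state (True = pulled in) at each time step.

    pulses: sequence of booleans, True while the chosen word output is high.
    latching: if True, a spare set of relay contacts is assumed to hold the
              relay on after the first pulse; there is no reset in this model.
    """
    states = []
    relay_on = False
    for word_output_high in pulses:
        # Hold-on path exists only if wired for latching and the relay is already in.
        hold = latching and relay_on
        relay_on = word_output_high or hold
        states.append(relay_on)
    return states


pulse = [False, True, False, False]   # word output high for one time step
print(simulate_relay(pulse))                  # [False, True, False, False]
print(simulate_relay(pulse, latching=True))   # [False, True, True, True]
```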
TABLE 1: WORD RECOGNITION LOGIC
(pin numbers refer to header pin set JP2)
Word      Outputs sent high
Word 1    Output 1 (pin 12)
Word 2    Output 2 (pin 13)
Word 3    Output 3 (pin 14)
Word 4    Output 4 (pin 15)
Word 5    Output 5 (pin 16)
Word 6    Output 6 (pin 17)
Word 7    Output 7 (pin 18)
Word 8    Output 8 (pin 19)
Word 9    Output 8 (pin 19) AND Output 1 (pin 12)
Word 10   Output 8 (pin 19) AND Output 2 (pin 13)
Word 11   Output 8 (pin 19) AND Output 3 (pin 14)
Word 12   Output 8 (pin 19) AND Output 4 (pin 15)
Word 13   Output 8 (pin 19) AND Output 5 (pin 16)
Word 14   Output 8 (pin 19) AND Output 6 (pin 17)
Word 15   Output 8 (pin 19) AND Output 7 (pin 18)
Where do you get the Voice Direct Module?
At the time of going to press, we were
unable to determine if distribution has
been arranged in Australia. It is possible
that some suppliers will have stock shortly.
These days, though, there is no great
problem as it is possible to buy the module
direct from the USA using the Internet.
Sensory Inc. distributors Jameco Electronic Components (www.jameco.com)
or JDR Computer Products (www.jdr.
com) have a kit available for $US49.95
plus postage and handling.
This kit includes three tiny pushbutton
switches (not the same as the Jaycar ones
we used but they will fit the PC board), a
speaker, microphone insert and of course
the pre-assembled module itself.
Apart from the PC board, all the components on our experimenter's board are
commonly available. The PC board should
be available from the usual PC board
suppliers such as RCS Radio in Sydney.
For more information on the Voice Direct
module, including detailed data in Acrobat
format (PDF), visit the Sensory Inc website:
www.sensoryinc.com
This is the full-size PC board pattern for the project. Because of the close
spacing of tracks (especially around the header pin sockets) copying this from
the magazine is not really a proposition. Use it instead to check commercially
obtained boards. Of course, the PC board pattern is available from the SILICON
CHIP website, www.siliconchip.com.au