Closed 9 years ago.
Can I detect blowing on a microphone with GStreamer (or another Linux-compatible sound library)?
I can get some information about the sound by doing this:
import gtk, gst
def playerbinMessage(bus, message):
    if message.type == gst.MESSAGE_ELEMENT:
        struct = message.structure
        if struct.get_name() == 'level':
            # printing peak, decay, rms
            print struct['peak'][0], struct['decay'][0], struct['rms'][0]
pipeline = gst.parse_launch('pulsesrc ! level ! filesink location=/dev/null')
bus = pipeline.get_bus()
bus.add_signal_watch()
bus.connect('message', playerbinMessage)
pipeline.set_state(gst.STATE_PLAYING)
gtk.main()
I use this to detect clapping, but I don't know if I can use this information to detect blowing without my computer confusing blowing with talking. Also, I don't know if there's another way to analyse sound with GStreamer or another Linux-compatible sound library.
You need to look at more than the audio level to distinguish between blowing and speech. For a start, consider that most speech consists of audio frequencies higher than about 80Hz, while blowing on the mic produces lots of low-frequency rumble.
So: if you want to stick with GStreamer, try using the "audiocheblimit" filter to low-pass the sound before measuring its level (something like audiocheblimit mode=low-pass cutoff=40 poles=4).
Personally, my approach would be more like:
- record the raw audio with something like python-alsaaudio
- compute the fourier transform of sound chunks using numpy
- sum up the amplitudes of low frequencies (20-40Hz, maybe) and trigger if this value is large enough.
If that didn't work, then I'd look for more clever detection algorithms. This approach (alsa+numpy) is very flexible, but a bit more complicated than the gstreamer approach.
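The alsa+numpy approach might be sketched roughly as follows. The capture side (python-alsaaudio) is omitted here and the sample rate and threshold are illustrative assumptions you would tune against your own microphone; only the numpy band-energy step is shown, on synthetic audio:

```python
import numpy as np

RATE = 8000          # assumed capture sample rate (Hz)
LOW, HIGH = 20, 40   # low-frequency band to sum, per the suggestion above

def is_blowing(samples, rate=RATE, threshold=1000.0):
    """Return True if the summed 20-40 Hz magnitude of `samples`
    exceeds `threshold` (an arbitrary value to tune by experiment)."""
    spectrum = np.abs(np.fft.rfft(samples))
    freqs = np.fft.rfftfreq(len(samples), d=1.0 / rate)
    band = spectrum[(freqs >= LOW) & (freqs <= HIGH)]
    return band.sum() > threshold

# Synthetic check: a 30 Hz rumble (blowing-like) triggers,
# a 300 Hz tone (speech-like) does not.
t = np.arange(RATE) / float(RATE)        # one second of audio
rumble = np.sin(2 * np.pi * 30 * t)
speech_like = np.sin(2 * np.pi * 300 * t)
```

In practice you would feed `is_blowing()` successive chunks read from the ALSA PCM device instead of the synthetic signals.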
edit: I just noticed gstreamer also has a "spectrum" element that will return the fourier transform.
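If you go the "spectrum" route, you need to know which of its bands cover the 20-40 Hz range before summing their magnitudes. A small helper, assuming the element splits 0..rate/2 into equal-width bands with band i centered at (i + 0.5) * width:

```python
def bands_in_range(rate, bands, lo=20.0, hi=40.0):
    """Indices of equal-width spectrum bands whose center frequency
    falls inside [lo, hi] Hz."""
    width = (rate / 2.0) / bands   # Hz covered by each band
    return [i for i in range(bands) if lo <= (i + 0.5) * width <= hi]
```

For example, at 8000 Hz with 400 bands each band is 10 Hz wide, so bands 2 and 3 (centers 25 Hz and 35 Hz) are the ones to watch in the element's magnitude messages.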
Just a mix of the answer above and the OP's code (sample pipeline):
#!/usr/bin/env python
import pygtk
pygtk.require('2.0')
import gtk, gst, time
class HelloWorld:
  def delete_event(self, widget, event, data=None):
      print "delete event occurred"
      return False
  def destroy(self, widget, data=None):
      print "destroy signal occurred"
      gtk.main_quit()
  def __init__(self):
      self.window = gtk.Window(gtk.WINDOW_TOPLEVEL)
      self.window.connect("delete_event", self.delete_event)
      self.window.connect("destroy", self.destroy)
      self.window.set_border_width(2)
      #self.window.set_size_request(600, 483)
      """ Play """
      self.vbox = gtk.VBox(False, 2)
      self.vbox.set_border_width(0)
      self.hbox = gtk.HBox()
      self.hlpass = gtk.Entry()
      self.hlpass.set_text("low-pass")
      self.hbox.pack_start( gtk.Label("High/Low-pass: "), False, False, 0 )
      self.hbox.pack_start( self.hlpass, False, False, 0 )
      self.vbox.add(self.hbox)
      self.hbox = gtk.HBox()
      self.cutoff = gtk.Entry()
      self.cutoff.set_text("40")
      self.hbox.pack_start( gtk.Label("Cutoff: "), False, False, 0 )
      self.hbox.pack_start( self.cutoff, False, False, 0 )
      self.vbox.add(self.hbox)
      self.hbox = gtk.HBox()
      self.poles = gtk.Entry()
      self.poles.set_text("4")
      self.hbox.pack_start( gtk.Label("Poles: "), False, False, 0 )
      self.hbox.pack_start( self.poles, False, False, 0 )
      self.vbox.add(self.hbox)
      self.hbox = gtk.HBox()
      self.button = gtk.Button("High-Pass")
      self.button.connect("clicked", self.change, None)
      self.hbox.pack_start(self.button, False, False, 0 )
      self.vbox.add(self.hbox)
      self.window.add(self.vbox)
      self.window.show_all()
  def main(self):
      self.gst()
      gtk.main()
  def gst(self):
      test = """
      alsasrc device=hw:0 ! audioconvert ! audioresample ! audiocheblimit mode=low-pass cutoff=40 poles=4 name=tuneit ! level ! autoaudiosink
      """
      self.pipeline = gst.parse_launch(test)
      self.bus = self.pipeline.get_bus()
      self.bus.add_signal_watch()
      self.bus.connect('message', self.playerbinMessage)
      self.pipeline.set_state(gst.STATE_PLAYING)
  def playerbinMessage(self,bus, message):
    if message.type == gst.MESSAGE_ELEMENT:
      struct = message.structure
      if struct.get_name() == 'level':
        print struct['peak'][0], struct['decay'][0], struct['rms'][0]
        #time.sleep(1)
  def change(self, widget, data=None):
    data = [self.hlpass.get_text(), self.cutoff.get_text(), self.poles.get_text()]
    print data[0], data[1], data[2]
    # update the running filter element (named "tuneit" in the pipeline)
    self.audiocheblimit = self.pipeline.get_by_name('tuneit')
    self.audiocheblimit.props.mode = data[0]
    self.audiocheblimit.props.cutoff = int(data[1])
    self.audiocheblimit.props.poles = int(data[2])
if __name__ == "__main__":
    hello = HelloWorld()
    hello.main()
Output low-pass:
-20.9227157774 -20.9227157774 -20.953279177
-20.9366239523 -20.9227157774 -20.9591815321
-20.9290995367 -20.9227157774 -20.9601319723
Output high-pass:
-51.2328030138 -42.8335117509 -62.2730163502
-51.3932079772 -43.3559607159 -62.2080540769
-52.1412276733 -43.8784096809 -62.9151309943
EDIT:
high-pass = picks up speech and pretty much all other audio
low-pass  = only strong low-frequency sound, e.g. blowing or talking right against the microphone
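Given level readings like the above, the decision rule with the low-pass filter in place reduces to a simple threshold on the rms value. The threshold here is an illustrative assumption to tune against your own readings:

```python
BLOW_THRESHOLD_DB = -30.0  # assumed cutoff; tune against your own mic

def looks_like_blowing(rms_db, threshold=BLOW_THRESHOLD_DB):
    """With the low-pass filter in the pipeline, a high rms level means
    strong low-frequency energy remains, i.e. probably blowing."""
    return rms_db > threshold
```

With the sample outputs above, the ~-20.9 dB low-pass readings would trigger, while the -51 to -62 dB high-pass readings would not.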
The CMU Sphinx project (http://cmusphinx.sourceforge.net/) is a toolkit for speech recognition, and it can use GStreamer to provide a microphone stream. It may be worth a look.