Huge News!Announcing our $40M Series B led by Abstract Ventures.Learn More
Socket
Sign inDemoInstall
Socket

voice_id

Package Overview
Dependencies
Maintainers
1
Alerts
File Explorer

Advanced tools

Socket logo

Install Socket

Detect and block malicious and high-risk dependencies

Install

voice_id

  • 0.1.2
  • Rubygems
  • Socket score

Version published
Maintainers
1
Created
Source

Gem Version CircleCI

VoiceId

Wrapper around Microsoft Cognitive Services - Speaker Recognition API

Installation

sign up and pick up a new api key (speaker recognition API key) https://www.microsoft.com/cognitive-services

$ gem install voice_id

Examples

  #create a new profile
  identification = VoiceId::Identification.new("MS_speaker_recognition_api_key")
  profile        = identification.create_profile
  # => { "identificationProfileId" => "49a46324-fc4b-4387-aa06-090cfbf0214f" }

  # create a new enrollment for that profile
  profile_id    = profile["identificationProfileId"]
  path_to_audio = '/path/to/some/audio_file.wav'
  short_audio   = true
  operation_url = identification.create_enrollment(profile_id , short_audio, path_to_audio)
  # => "https://api.projectoxford.ai/spid/v1.0/operations/EF217D0C-9085-45D7-AAE0-2B36471B89B5"

  # check the status of operation
  operation_id = identification.get_operation_id(operation_url)
  # => "EF217D0C-9085-45D7-AAE0-2B36471B89B5"

  identification.get_operation_status(operation_id)
  # notice below that we only had 13.6 seconds of useable audio so we need to
  # submit more enrollments for this profile until we achieve at min 30 seconds
  # => 
  #   {
  #    "status" => "succeeded", 
  #    "createdDateTime" => "2016-09-23T01:34:44.226642Z",
  #    "lastActionDateTime" => "2016-09-23T01:34:44.4795299Z",
  #    "processingResult" => {
  #      "enrollmentStatus" => "Enrolling", 
  #      "remainingEnrollmentSpeechTime" => 16.4, 
  #      "speechTime" => 13.6, 
  #      "enrollmentSpeechTime"=>13.6
  #    }
  #   }

  # identify a speaker
  profile_ids                  = ["49a46324-fc4b-4387-aa06-090cfbf0214f", "49a36324-fc4b-4387-aa06-091cfbf0216b", ...]
  path_to_test_audio           = '/path/to/some/audio_file.wav'
  short_audio                  = true
  identification_operation_url = identification.identify_speaker(profile_ids, short_audio, path_to_test_audio)
  # => "https://api.projectoxford.ai/spid/v1.0/operations/EF217D0C-9085-45D7-AAE0-2B36471B89B6"
  identification_operation_id = identification.get_operation_id(identification_operation_url)
  # => "EF217D0C-9085-45D7-AAE0-2B36471B89B6"
  identification.get_operation_status(identification_operation_id)
  # => 
  # {
  #  "status" => "succeeded", 
  #  "createdDateTime" => "2016-09-23T02:01:54.6498703Z",
  #  "lastActionDateTime" => "2016-09-23T02:01:56.054633Z",
  #  "processingResult" => {
  #    "identifiedProfileId" => "49a46324-fc4b-4387-aa06-090cfbf0214f", 
  #    "confidence"=>"High"
  #   }
  # }

APIs

Provides methods for two APIs (Identification and Verification) All audio samples provided to the API must be the following format:

Container WAV
Encoding  PCM
Rate  16K
Sample Format 16 bit
Channels  Mono

Identification API

Identify a person from a list of people - this is a text-independant api. Prior to being able to identify a speaker, a speaker (profile) must send a minimum of 30 seconds of recognizable audio.

identification = VoiceId::Identification.new("MS_speaker_recognition_api_key")
create_profile

Each person needs a unique profile, this creates a new one.

  profile = identification.create_profile
  # => { "identificationProfileId" => "49a36324-fc4b-4387-aa06-090cfbf0064f" }
create_enrollment(profile_id, short_audio, audio_file_path)

An enrollment is how audio samples are associated with a profile (training the service). For the Identification API a minimum of 30 seconds of recognizable speach is required. This can be done through multiple enrollments. This creates a new enrollment for a profile.

  profile_id    = "1234567890"
  path_to_audio = '/path/to/some/audio_file.wav'
  short_audio   = true # true - set minimum duration to 1 sec (5 sec by default per enrollment)
  identification.create_enrollment(profile_id, short_audio, path_to_audio)
  # returns an operation url that you can use to check the status of the enrollment
  # => "https://api.projectoxford.ai/spid/v1.0/operations/EF217D0C-9085-45D7-AAE0-2B36471B89B5"
get_operation_id(operation_status_url)

Certain endpoints take time to process to they return a url for you to check on the status of the operation. To parse out the operation id use this method. Now you can use #get_operation_status to check the id.

  operation_status_url = identification.create_enrollment(profile_id, short_audio, path_to_audio)
  # => "https://api.projectoxford.ai/spid/v1.0/operations/EF217D0C-9085-45D7-AAE0-2B36471B89B5"
  identification_operation_id = identification.get_operation_id(operation_status_url)
  # => "EF217D0C-9085-45D7-AAE0-2B36471B89B6"
get_operation_status(operation_id)

Check on the status of an operation by passing in the operation id (use #get_operation_id to get the id)

  identification.get_operation_status(identification_operation_id)
  # => 
  # {
  #  "status" => "succeeded", 
  #  "createdDateTime" => "2016-09-23T02:01:54.6498703Z",
  #  "lastActionDateTime" => "2016-09-23T02:01:56.054633Z",
  #  "processingResult" => {
  #    "identifiedProfileId" => "49a59333-ur9d-4387-wd06-880cfby0215f", 
  #    "confidence"=>"High"
  #   }
  # }
delete_profile(profile_id)

Delete a particular profile from the service.

  profile_id = "1234567890"
  identification.delete_profile(profile_id)
  # => true
get_all_profiles

Returns a list of all the profiles for this account.

    identification.get_all_profiles
    # => 
    # [
    #   {
    #     "identificationProfileId" => "111f427c-3791-468f-b709-fcef7660fff9",
    #     "locale" => "en-US",
    #     "enrollmentSpeechTime" => 0.0
    #     "remainingEnrollmentSpeechTime" => 0.0,
    #     "createdDateTime" => "2015-04-23T18:25:43.511Z",
    #     "lastActionDateTime" => "2015-04-23T18:25:43.511Z",
    #     "enrollmentStatus" => "Enrolled" //[Enrolled | Enrolling | Training]
    #   }, ...
    # ]
get_profile(profileId)

Returns a profile's details

  profile_id = "1234567890"
  identification.get_profile(profile_id)
  # =>
  #   {
  #     "identificationProfileId" => "111f427c-3791-468f-b709-fcef7660fff9",
  #     "locale" => "en-US",
  #     "enrollmentSpeechTime" => 0.0,
  #     "remainingEnrollmentSpeechTime" => 0.0,
  #     "createdDateTime" => "2015-04-23T18:25:43.511Z",
  #     "lastActionDateTime" => "2015-04-23T18:25:43.511Z",
  #     "enrollmentStatus" => "Enrolled" //[Enrolled | Enrolling | Training]
  #   }
reset_all_enrollments_for_profile(profileId)

Resets all the enrollments for a particular profile

  profile_id = "1234567890"
  identification.reset_all_enrollments_for_profile(profile_id)
  # => true
identify_speaker(profile_ids, short_audio, audio_file_path)

Identify a speaker by calling this method with an array of enrolled profile_ids. Use short_audio to wave the required 5-second speech sample. The audio sample to be analyzed should ideally be 30 seconds, with a maximum of 5 mins.

  profile_ids        = ["49a46324-fc4b-4387-aa06-090cfbf0214f", "49a36324-fc4b-4387-aa06-091cfbf0216b", ...]
  path_to_test_audio = '/path/to/some/audio_file.wav'
  short_audio        = true
  operation_url      = identification.identify_speaker(profile_ids, short_audio, path_to_test_audio)
  # => "https://api.projectoxford.ai/spid/v1.0/operations/EF217D0C-9085-45D7-AAE0-2B36471B89B6"
  identification_operation_id = identification.get_operation_id(operation_url)
  # => "EF217D0C-9085-45D7-AAE0-2B36471B89B6"
  identification.get_operation_status(identification_operation_id)
  # => 
  # {
  #  "status" => "succeeded", 
  #  "createdDateTime" => "2016-09-23T02:01:54.6498703Z",
  #  "lastActionDateTime" => "2016-09-23T02:01:56.054633Z",
  #  "processingResult" => {
  #    "identifiedProfileId" => "49a46324-fc4b-4387-aa06-090cfbf0214f", 
  #    "confidence"=>"High"
  #   }
  # }

Verification API

Verify that a person is who they say they are - this is a text-dependent api. Prior to being able to verify a speaker, a speaker (profile) must send three audio samples (from an API provided list) with their enrollment.

verification = VoiceId::Verification.new("MS_speaker_recognition_api_key")
list_all_verification_phrases

Get a list of accepted scripts to use when sending your audio sample.

  verification.list_all_verification_phrases
  # =>
  #  [
  #   {"phrase" => "i am going to make him an offer he cannot refuse"}, 
  #   {"phrase" => "houston we have had a problem"}, 
  #   {"phrase" => "my voice is my passport verify me"}, 
  #   {"phrase" => "apple juice tastes funny after toothpaste"}, 
  #   {"phrase" => "you can get in without your password"}, 
  #   {"phrase" => "you can activate security system now"}, 
  #   {"phrase" => "my voice is stronger than passwords"}, 
  #   {"phrase" => "my password is not your business"}, 
  #   {"phrase" => "my name is unknown to you"}, 
  #   {"phrase" => "be yourself everyone else is already taken"}
  #  ]
create_profile

Same as Identification API

create_enrollment(profile_id, audio_file_path)

Requires 3 enrollments. Pick 3 of the acceptable phrases from #list_all_verification_phrases and enroll them.

  verification.create_enrollment("49a46324-fc4b-4387-aa06-090cfbf0214f", '/path/to/audio/make_him_an_offer.wav')
  # =>
  #   {
  #     "enrollmentStatus" => "Enrolling",
  #     "enrollmentsCount" => 1,
  #     "remainingEnrollments" => 2,
  #     "phrase" => "i am going to make him an offer he cannot refuse"
  #   }
delete_profile(profile_id)

Same as Identification API

get_all_profiles

Same as Identification API

get_profile(profile_id)

Same as Identification API

reset_all_enrollments_for_profile(profile_id)

Same as Identification API

verify_speaker(profile_id, audio_file_path)

User (profile) would have had to enroll with 3 of the accepted phrases (#list_all_verification_phrases). Once the phrases have been accepted, a recording of one of the accepted phrases can be checked against an enrolled profile.

  verification.verify_speaker("86935587-b631-4cc7-a59t-8e580d71522g", "/path/to/audio/offer_converted.wav")
  # =>
  #   {
  #    "result" => "Accept", 
  #    "confidence" => "High", 
  #    "phrase" => "i am going to make him an offer he cannot refuse"
  #   }

FAQs

Package last updated on 13 Feb 2017

Did you know?

Socket

Socket for GitHub automatically highlights issues in each pull request and monitors the health of all your open source dependencies. Discover the contents of your packages and block harmful activity before you install or update your dependencies.

Install

Related posts

SocketSocket SOC 2 Logo

Product

  • Package Alerts
  • Integrations
  • Docs
  • Pricing
  • FAQ
  • Roadmap
  • Changelog

Packages

npm

Stay in touch

Get open source security insights delivered straight into your inbox.


  • Terms
  • Privacy
  • Security

Made with ⚡️ by Socket Inc