This demonstration showcases one of the Breadboard project’s boards: a Hugging Face board for audio transcription. The board uses the Hugging Face Inference API to convert audio files into text transcripts.

Find out more about speech recognition with the Hugging Face Inference API here.

WARNING

Because the core kit stringifies the request body, this won't work inside the board or in Breadboard Web.

However, it can still be run from the CLI, where it demonstrates the expected board functionality using the built-in fetch function.
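To make the warning concrete, here is a minimal sketch of what happens when binary audio is stringified. It assumes the stringification behaves like `JSON.stringify`, which may not match the core kit exactly; the four bytes are a hypothetical stand-in for real audio data.

```typescript
// Stand-in for the first bytes of an audio file ("RIFF" in ASCII).
const audioBytes = new Uint8Array([0x52, 0x49, 0x46, 0x46]);

// Assumption: the request body is passed through JSON.stringify.
// A typed array is serialized as an object keyed by index, not as bytes.
const stringified = JSON.stringify(audioBytes);

console.log(stringified); // {"0":82,"1":73,"2":70,"3":70}
```

The API would receive JSON text instead of the raw audio bytes, which is why the request has to be sent with a plain fetch call.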

Input Parameters

  • File Name: The name of the audio file to be transcribed.
  • API Key: The Hugging Face Inference API key necessary for accessing the transcription service.

Demonstration Steps

  1. Prepare the Audio File: Ensure you have the audio file ready for transcription. For this demonstration, we use a sample audio file with the following content:
     Hello. I am Google Translate. Please make a transcript of what I am saying.
  2. Run the Board: Execute the board with the specified audio file and API key. The board processes the audio and returns the transcribed text.
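The steps above can be sketched outside the board as a direct call to the Inference API. This is an illustration under stated assumptions, not the board's actual code: the endpoint URL pattern, the model id, and the helper names (`buildRequest`, `parseTranscript`, `transcribe`) are all hypothetical choices, and the response is assumed to be JSON with a `text` field.

```typescript
// Assumption: hosted Inference API URL pattern and an illustrative model id;
// the board may target a different model.
const MODEL_ID = "openai/whisper-large-v3";
const API_URL = `https://api-inference.huggingface.co/models/${MODEL_ID}`;

interface TranscriptionRequest {
  url: string;
  init: { method: "POST"; headers: Record<string, string>; body: Uint8Array };
}

// Minimal shape of the fetch-like function this sketch needs.
type FetchLike = (
  url: string,
  init: TranscriptionRequest["init"]
) => Promise<{ status: number; json(): Promise<unknown> }>;

// Build the request: the audio travels as raw bytes, not as JSON.
function buildRequest(audio: Uint8Array, apiKey: string): TranscriptionRequest {
  return {
    url: API_URL,
    init: {
      method: "POST",
      headers: { Authorization: `Bearer ${apiKey}` },
      body: audio,
    },
  };
}

// Extract the transcript from a payload assumed to look like { "text": "..." }.
function parseTranscript(payload: unknown): string {
  if (typeof payload === "object" && payload !== null && "text" in payload) {
    const text = (payload as { text: unknown }).text;
    if (typeof text === "string") return text;
  }
  throw new Error(`Unexpected response: ${JSON.stringify(payload)}`);
}

// Send the audio and return the transcribed text.
async function transcribe(
  audio: Uint8Array,
  apiKey: string,
  fetchFn: FetchLike
): Promise<string> {
  const { url, init } = buildRequest(audio, apiKey);
  const response = await fetchFn(url, init);
  return parseTranscript(await response.json());
}
```

On the CLI, the audio bytes could be read from the file named by the File Name input and the built-in fetch function passed as `fetchFn`. Injecting the fetch function keeps the sketch testable without network access.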

Output

Upon running the board, the output is as follows:

Hello, I am GUGAL Translate. Please make a transcript of what I am saying.

The transcript matches the spoken content except that "Google" is rendered as "GUGAL" and the period after "Hello" becomes a comma, which leaves room for improvement in future iterations.
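One way to put a number on this result is a word error rate: the word-level edit distance between the spoken text and the transcript, divided by the number of spoken words. The sketch below is an illustrative measurement, not part of the board; the helper names are hypothetical, and punctuation and case are ignored by assumption.

```typescript
// Lowercase, strip punctuation, and split into words.
function toWords(text: string): string[] {
  return text
    .toLowerCase()
    .replace(/[^a-z0-9\s]/g, "")
    .split(/\s+/)
    .filter((w) => w.length > 0);
}

// Word-level Levenshtein distance (substitutions, insertions, deletions).
function wordEditDistance(ref: string[], hyp: string[]): number {
  let prev: number[] = Array.from({ length: hyp.length + 1 }, (_, j) => j);
  for (let i = 1; i <= ref.length; i++) {
    const curr: number[] = [i];
    for (let j = 1; j <= hyp.length; j++) {
      const cost = ref[i - 1] === hyp[j - 1] ? 0 : 1;
      curr.push(Math.min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost));
    }
    prev = curr;
  }
  return prev[hyp.length];
}

const spoken =
  "Hello. I am Google Translate. Please make a transcript of what I am saying.";
const transcript =
  "Hello, I am GUGAL Translate. Please make a transcript of what I am saying.";

const refWords = toWords(spoken);
const errors = wordEditDistance(refWords, toWords(transcript));

console.log(`${errors} error(s) in ${refWords.length} words`); // 1 error(s) in 14 words
console.log((errors / refWords.length).toFixed(3)); // 0.071
```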

Future Integration

There are plans to integrate this functionality with the Chrome Breadboard extension, although this is still a work in progress.

The Chrome extension would either accept file uploads or allow audio recordings, which would then be transcribed using the board demonstrated here.

Conclusion

This demonstration highlights the capability of the Breadboard project’s Hugging Face board to transcribe audio files using the Hugging Face Inference API, with a single misrecognized word in the sample. Further enhancements and integrations are planned to expand its usability and efficiency. Thank you for exploring this demonstration.

Source