c# – Connecting to Microsoft's Cognitive Speaker Recognition API via Xamarin.Android

I am building a test app that authenticates users through Microsoft's Cognitive Speaker Recognition API. This seems straightforward, but as mentioned in the API docs, when creating an enrollment I need to send the byte[] of the audio file I recorded. Since I am using Xamarin.Android, I can record audio and save it. However, Microsoft's Cognitive Speaker Recognition API has very specific requirements for that audio.

According to the API documentation, the audio file must meet the following requirements:

Container -> WAV
Encoding -> PCM
Rate -> 16K
Sample Format -> 16 bit
Channels -> Mono
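
MediaRecorder's 3GPP/AMR output cannot satisfy these requirements directly, but raw PCM (for example from Android's lower-level AudioRecord class) only needs a 44-byte RIFF header prepended to become a valid WAV file. A minimal sketch in plain .NET, assuming 16 kHz mono 16-bit samples; the class and method names are illustrative:

```csharp
using System;
using System.IO;
using System.Text;

static class WavWriter
{
    // Prepends a standard 44-byte RIFF/WAVE header to raw PCM data.
    public static byte[] ToWav(byte[] pcm, int sampleRate = 16000, short channels = 1, short bitsPerSample = 16)
    {
        short blockAlign = (short)(channels * bitsPerSample / 8);
        int byteRate = sampleRate * blockAlign;

        using (var ms = new MemoryStream())
        using (var w = new BinaryWriter(ms))
        {
            w.Write(Encoding.ASCII.GetBytes("RIFF"));
            w.Write(36 + pcm.Length);                  // total file size minus 8 bytes
            w.Write(Encoding.ASCII.GetBytes("WAVE"));
            w.Write(Encoding.ASCII.GetBytes("fmt "));
            w.Write(16);                               // fmt chunk size for PCM
            w.Write((short)1);                         // audio format: 1 = PCM
            w.Write(channels);
            w.Write(sampleRate);
            w.Write(byteRate);
            w.Write(blockAlign);
            w.Write(bitsPerSample);
            w.Write(Encoding.ASCII.GetBytes("data"));
            w.Write(pcm.Length);
            w.Write(pcm);
            return ms.ToArray();
        }
    }
}
```

The byte[] returned here is a complete WAV file matching the spec above, which is the shape of data the enrollment endpoint expects.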

Following this recipe, I was able to record audio successfully, and after playing with the Android documentation a bit I was also able to apply these settings:

_recorder.SetOutputFormat(OutputFormat.ThreeGpp);

_recorder.SetAudioChannels(1);
_recorder.SetAudioSamplingRate(16);
_recorder.SetAudioEncodingBitRate(16000);

_recorder.SetAudioEncoder((AudioEncoder) Encoding.Pcm16bit);

This meets most of the criteria for the required audio file. However, I can't seem to save the file in an actual ".wav" container, and I have no way of verifying that the file is really PCM-encoded.
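
One way to check whether a saved file really is a PCM WAV is to parse its RIFF header directly. A minimal sketch in plain .NET, assuming the canonical 44-byte header layout; the class name is illustrative:

```csharp
using System;
using System.IO;
using System.Text;

static class WavInspector
{
    // Parses the canonical 44-byte WAV header and returns the key format fields.
    public static (short Format, short Channels, int SampleRate, short BitsPerSample) ReadHeader(Stream stream)
    {
        using (var r = new BinaryReader(stream))
        {
            if (Encoding.ASCII.GetString(r.ReadBytes(4)) != "RIFF")
                throw new InvalidDataException("Not a RIFF file");
            r.ReadInt32();                                           // overall file size - 8
            if (Encoding.ASCII.GetString(r.ReadBytes(4)) != "WAVE")
                throw new InvalidDataException("Not a WAVE file");
            r.ReadBytes(4);                                          // "fmt " chunk id
            r.ReadInt32();                                           // fmt chunk size (16 for PCM)
            short format = r.ReadInt16();                            // 1 = PCM
            short channels = r.ReadInt16();
            int sampleRate = r.ReadInt32();
            r.ReadInt32();                                           // byte rate
            r.ReadInt16();                                           // block align
            short bits = r.ReadInt16();
            return (format, channels, sampleRate, bits);
        }
    }
}
```

Opening the recorded file with File.OpenRead and passing it to ReadHeader shows at a glance whether the format field is 1 (PCM) and whether channels, sample rate, and bit depth match the API's requirements.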

Here are my AXML and MainActivity.cs: Github Gist

I also followed this code and merged it into mine: Github Gist

The file's specs look fine, but the duration is wrong. No matter how long I record, it only shows 250 ms, which makes the audio too short.

Is there a way to do this? Essentially I just want to connect to Microsoft's Cognitive Speaker Recognition API via Xamarin.Android, and I can't find any resources to guide me.

Best answer
Recording audio

Add the Audio Recorder Plugin NuGet Package to the Android project (and to any PCL, netstandard, or iOS libraries, if you use them).

Android project configuration

> In AndroidManifest.xml, add the following permissions:

<uses-permission android:name="android.permission.MODIFY_AUDIO_SETTINGS" />
<uses-permission android:name="android.permission.READ_EXTERNAL_STORAGE" />
<uses-permission android:name="android.permission.RECORD_AUDIO" />
<uses-permission android:name="android.permission.WRITE_EXTERNAL_STORAGE" />
<uses-permission android:name="android.permission.INTERNET" />

> In AndroidManifest.xml, add the following provider inside the <application></application> tags:

<provider android:name="android.support.v4.content.FileProvider" android:authorities="${applicationId}.fileprovider" android:exported="false" android:grantUriPermissions="true">
    <meta-data android:name="android.support.FILE_PROVIDER_PATHS" android:resource="@xml/file_paths"></meta-data>
</provider>


> In the Resources folder, create a new folder named xml
> Inside Resources/xml, create a new file named file_paths.xml


> In file_paths.xml, add the following code, replacing [your package name] with your Android project's package name:

A typical file_paths.xml for this FileProvider looks like the following (the exact path entry may vary for your app):

<?xml version="1.0" encoding="utf-8"?>
<paths xmlns:android="http://schemas.android.com/apk/res/android">
    <external-path name="external_files" path="Android/data/[your package name]/files" />
</paths>
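
On Android 6.0 and later, the RECORD_AUDIO and external-storage permissions declared in the manifest are "dangerous" permissions and must also be granted at runtime. A hedged sketch for the host Activity using the Xamarin.Android support-library APIs; the request code is an arbitrary value chosen here:

```csharp
// Inside your Activity (e.g. called from OnCreate). Assumes the
// Xamarin.Android.Support.Compat package is referenced.
using Android;
using Android.Content.PM;
using Android.Support.V4.App;
using Android.Support.V4.Content;

const int AudioPermissionsRequestCode = 100; // arbitrary request code

void EnsureAudioPermissions()
{
    if (ContextCompat.CheckSelfPermission(this, Manifest.Permission.RecordAudio) != Permission.Granted)
    {
        // Prompts the user; the result arrives in OnRequestPermissionsResult.
        ActivityCompat.RequestPermissions(
            this,
            new[] { Manifest.Permission.RecordAudio, Manifest.Permission.WriteExternalStorage },
            AudioPermissionsRequestCode);
    }
}
```

Without the runtime grant, recording silently fails on newer devices even though the manifest entries are present.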


Android audio recorder code

AudioRecorderService AudioRecorder { get; } = new AudioRecorderService
{
    StopRecordingOnSilence = true,
    PreferredSampleRate = 16000
};

public async Task StartRecording()
{
    AudioRecorder.AudioInputReceived += HandleAudioInputReceived;
    await AudioRecorder.StartRecording();
}

public async Task StopRecording()
{
    await AudioRecorder.StopRecording();
}

async void HandleAudioInputReceived(object sender, string e)
{
    AudioRecorder.AudioInputReceived -= HandleAudioInputReceived;

    PlaybackRecording();

    //replace [UserGuid] with your unique Guid
    await EnrollSpeaker(AudioRecorder.GetAudioFileStream(), [UserGuid]);
}
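
The recorder methods above can be wired to a UI toggle. A hedged sketch for an Activity with a single button; the layout and button id are assumptions, not part of the original sample:

```csharp
// Xamarin.Android Activity fragment; assumes a Button with id "recordButton"
// exists in Resource.Layout.Main.
bool _isRecording;

protected override void OnCreate(Bundle savedInstanceState)
{
    base.OnCreate(savedInstanceState);
    SetContentView(Resource.Layout.Main);

    var recordButton = FindViewById<Button>(Resource.Id.recordButton);
    recordButton.Click += async (s, e) =>
    {
        // Toggle between recording and stopped states on each tap.
        if (!_isRecording)
            await StartRecording();
        else
            await StopRecording();

        _isRecording = !_isRecording;
        recordButton.Text = _isRecording ? "Stop" : "Record";
    };
}
```

Because StopRecordingOnSilence is enabled, recording may also end on its own, in which case HandleAudioInputReceived fires without the user tapping Stop.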

Cognitive Services Speaker Recognition code

static HttpClient Client { get; } = CreateHttpClient(TimeSpan.FromSeconds(10));

public static async Task<EnrollmentStatus?> EnrollSpeaker(Stream audioStream, Guid userGuid)
{
    Enrollment response = null;
    try
    {
        var boundaryString = "Upload----" + DateTime.Now.ToString("u").Replace(" ", "");
        var content = new MultipartFormDataContent(boundaryString)
        {
            { new StreamContent(audioStream), "enrollmentData", userGuid.ToString("D") + "_" + DateTime.Now.ToString("u") }
        };

        var requestUrl = "https://westus.api.cognitive.microsoft.com/spid/v1.0/verificationProfiles" + "/" + userGuid.ToString("D") + "/enroll";
        var result = await Client.PostAsync(requestUrl, content).ConfigureAwait(false);
        string resultStr = await result.Content.ReadAsStringAsync().ConfigureAwait(false);

        if (result.StatusCode == HttpStatusCode.OK)
            response = JsonConvert.DeserializeObject<Enrollment>(resultStr);

        return response?.EnrollmentStatus;
    }
    catch (Exception)
    {
        // Network or deserialization failure: fall through and return null
    }

    return response?.EnrollmentStatus;
}

static HttpClient CreateHttpClient(TimeSpan timeout)
{
    HttpClient client = new HttpClient();

    client.Timeout = timeout;
    client.DefaultRequestHeaders.AcceptEncoding.Add(new StringWithQualityHeaderValue("gzip"));
    client.DefaultRequestHeaders.Accept.Add(new MediaTypeWithQualityHeaderValue("application/json"));

    //replace [Your Speaker Recognition API Key] with your Speaker Recognition API Key from the Azure Portal
    client.DefaultRequestHeaders.Add("Ocp-Apim-Subscription-Key", [Your Speaker Recognition API Key]);

    return client;
}
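
EnrollSpeaker assumes the verification profile identified by userGuid already exists. With the v1.0 API, a profile is created first by POSTing to the verificationProfiles endpoint; a sketch reusing the same Client as above (the response-parsing shape is an assumption based on the documented { "verificationProfileId": "..." } payload):

```csharp
// Creates a verification profile and returns its GUID.
// "en-US" is the locale the v1.0 verification API supports.
public static async Task<Guid> CreateVerificationProfile()
{
    var body = new StringContent("{\"locale\":\"en-US\"}", Encoding.UTF8, "application/json");
    var result = await Client.PostAsync(
        "https://westus.api.cognitive.microsoft.com/spid/v1.0/verificationProfiles",
        body).ConfigureAwait(false);

    result.EnsureSuccessStatusCode();

    var json = await result.Content.ReadAsStringAsync().ConfigureAwait(false);
    var profile = JsonConvert.DeserializeObject<Dictionary<string, string>>(json);
    return Guid.Parse(profile["verificationProfileId"]);
}
```

The returned GUID is the value to pass as userGuid in the enrollment call; verification requires three successful enrollments of the same phrase before the profile can be used.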

public class Enrollment : EnrollmentBase
{
    [JsonConverter(typeof(StringEnumConverter))]
    public EnrollmentStatus EnrollmentStatus { get; set; }
    public int RemainingEnrollments { get; set; }
    public int EnrollmentsCount { get; set; }
    public string Phrase { get; set; }
}

public enum EnrollmentStatus
{
    Enrolling,
    Training,
    Enrolled
}
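
Enrollment derives from an EnrollmentBase type not shown above. A minimal sketch of what it might carry, based on the speech-time fields the v1.0 enrollment responses return; the exact property set is an assumption:

```csharp
// Hypothetical base class; holds fields shared across enrollment responses.
public abstract class EnrollmentBase
{
    public double RemainingEnrollmentsSpeechTime { get; set; }
    public double SpeechTime { get; set; }
}
```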

Audio playback

Configuration

Add the SimpleAudioPlayer Plugin NuGet Package to the Android project (and to any PCL, netstandard, or iOS libraries, if you use them).

public void PlaybackRecording()
{
    var isAudioLoaded = Plugin.SimpleAudioPlayer.CrossSimpleAudioPlayer.Current.Load(AudioRecorder.GetAudioFileStream());

    if (isAudioLoaded)
        Plugin.SimpleAudioPlayer.CrossSimpleAudioPlayer.Current.Play();
}
