音频编码 Audio Converter

栏目: IOS · 发布时间: 4年前

内容简介:iOS中将采集到的原始音频数据(PCM)进行编码以得到压缩数据类型(AAC...).本例最终实现的是通过Audio Unit采集到PCM数据,将其压缩转为AAC数据,并以录制的形式保存在沙盒中.可调整编码后音频数据格式,采样率,编码器类型等参数.利用Audio Toolbox Framework中的Audio Converter可以实现音频数据编码,即将PCM数据转为其他压缩格式.

iOS中将采集到的原始音频数据(PCM)进行编码以得到压缩数据类型(AAC...).

本例最终实现的是通过Audio Unit采集到PCM数据,将其压缩转为AAC数据,并以录制的形式保存在沙盒中.可调整编码后音频数据格式,采样率,编码器类型等参数.

实现原理

利用Audio Toolbox Framework中的Audio Converter可以实现音频数据编码,即将PCM数据转为其他压缩格式.

阅读前提:

  • Core Audio基本原理:简书, 掘金 , 博客
  • Audio Unit概念篇:简书, 掘金 , 博客
  • 音频采集: Audio Unit简书, 掘金 , 博客
  • 音视频基础知识
  • C,C++基本知识

GitHub地址(附代码) : 音频编码

简书地址 :音频编码

掘金地址 :音频编码

博客地址 :音频编码

1.初始化

1.1. 初始化编码器

初始化编码器实例, 通过指定原始数据格式,最终编码后的格式,采样率,以及使用硬编还是软编,以下是具体步骤.

- (instancetype)initWithSourceFormat:(AudioStreamBasicDescription)sourceFormat destFormatID:(AudioFormatID)destFormatID sampleRate:(float)sampleRate isUseHardwareEncode:(BOOL)isUseHardwareEncode {
    if (self = [super init]) {
        mSourceFormat   = sourceFormat;
        mAudioConverter = [self configureEncoderBySourceFormat:sourceFormat
                                                    destFormat:&mDestinationFormat
                                                  destFormatID:destFormatID
                                                    sampleRate:sampleRate
                                           isUseHardwareEncode:isUseHardwareEncode];
    }
    return self;
}

复制代码

1.2. 配置编码后ASBD音频流信息

AudioStreamBasicDescription destinationFormat = {};
    destinationFormat.mSampleRate = sampleRate;
    if (destFormatID == kAudioFormatLinearPCM) {
        NSLog(@"Not get PCM format after encoding !");
        return NULL;
    } else {
        destinationFormat.mFormatID = destFormatID;
        
        // For iLBC, the number of channels must be 1.
        destinationFormat.mChannelsPerFrame = (destFormatID == kAudioFormatiLBC ? 1 : sourceFormat.mChannelsPerFrame);
        
        // Use AudioFormat API to fill out the rest of the description.
        size = sizeof(destinationFormat);
        if (![self checkError:AudioFormatGetProperty(kAudioFormatProperty_FormatInfo, 0, NULL, &size, &destinationFormat) withErrorString:@"AudioFormatGetProperty couldn't fill out the destination data format"]) {
            return NULL;
        }
    }
    memcpy(destFormat, &destinationFormat, sizeof(AudioStreamBasicDescription));
复制代码

对音频做编码操作,实际就是将PCM格式转为如AAC等音频压缩格式(VBR格式),通过 kAudioFormatProperty_FormatInfo 属性可以自动获取指定音频格式的参数信息.

注意: 如果音频格式是iLBC, 声道数只能为1.

1.3. 选择编码器类型

AudioClassDescription 结构体描述了系统使用音频编码器信息,其中最重要的就是指定使用硬编或软编。然后编码器的数量,即数组的个数,由当前的声道数决定。

// encoder conut by channels.
    AudioClassDescription requestedCodecs[destinationFormat.mChannelsPerFrame];
    const OSType subtype = destFormatID;
    for (int i = 0; i < destinationFormat.mChannelsPerFrame; i++) {
        AudioClassDescription codec = {
            kAudioEncoderComponentType,
            subtype,
            isUseHardwareEncode ? kAppleHardwareAudioCodecManufacturer : kAppleSoftwareAudioCodecManufacturer,
        };
        requestedCodecs[i] = codec;
    }
复制代码

注意:硬编即利用设备GPU硬件完成高效编码,降低CPU消耗. 软编就是传统的通过CPU计算。

1.4. 创建编码器

AudioConverterNewSpecific : 通过指定编码器来创建audio converter实例对象.第3,4个 分别是编码器的数量与编码器描述,同上,与声道数保持一致.

// Create the AudioConverterRef.
    AudioConverterRef converter = NULL;
    if (![self checkError:AudioConverterNewSpecific(&sourceFormat, &destinationFormat, destinationFormat.mChannelsPerFrame, requestedCodecs, &converter) withErrorString:@"AudioConverterNew failed"]) {
        return NULL;
    }else {
        printf("Audio converter create successful \n");
    }
复制代码

1.5. 设置码率

我们可以手动设置需要的码率,如果没有特殊要求一般可以根据采样率使用建议值,如下.

/*
     If encoding to AAC set the bitrate kAudioConverterEncodeBitRate is a UInt32 value containing
     the number of bits per second to aim for when encoding data when you explicitly set the bit rate
     and the sample rate, this tells the encoder to stick with both bit rate and sample rate
     but there are combinations (also depending on the number of channels) which will not be allowed
     if you do not explicitly set a bit rate the encoder will pick the correct value for you depending
     on samplerate and number of channels bit rate also scales with the number of channels,
     therefore one bit rate per sample rate can be used for mono cases and if you have stereo or more,
     you can multiply that number by the number of channels.
     */
    
    if (destinationFormat.mFormatID == kAudioFormatMPEG4AAC) {
        UInt32 outputBitRate = 64000;
        
        UInt32 propSize = sizeof(outputBitRate);
        
        if (destinationFormat.mSampleRate >= 44100) {
            outputBitRate = 192000;
        } else if (destinationFormat.mSampleRate < 22000) {
            outputBitRate = 32000;
        }
        outputBitRate *= destinationFormat.mChannelsPerFrame;
        
        // Set the bit rate depending on the sample rate chosen.
        if (![self checkError:AudioConverterSetProperty(converter, kAudioConverterEncodeBitRate, propSize, &outputBitRate) withErrorString:@"AudioConverterSetProperty kAudioConverterEncodeBitRate failed!"]) {
            return NULL;
        }
        
        // Get it back and print it out.
        AudioConverterGetProperty(converter, kAudioConverterEncodeBitRate, &propSize, &outputBitRate);
        printf ("AAC Encode Bitrate: %u\n", (unsigned int)outputBitRate);
    }
复制代码

1.6. 设置中断后是否可恢复

kAudioConverterPropertyCanResumeFromInterruption : 设置converter能否在中断后恢复.

如果没有显式实现该属性或get此属性返回错误,说明当前不是硬编,如果此查询返回1表明编码器可以在中断后恢复.否则不能恢复.

/*
     Can the Audio Converter resume after an interruption?
     this property may be queried at any time after construction of the Audio Converter after setting its output format
     there's no clear reason to prefer construction time, interruption time, or potential resumption time but we prefer
     construction time since it means less code to execute during or after interruption time.
     */
    BOOL canResumeFromInterruption = YES;
    UInt32 canResume = 0;
    size = sizeof(canResume);
    OSStatus error = AudioConverterGetProperty(converter, kAudioConverterPropertyCanResumeFromInterruption, &size, &canResume);
    
    if (error == noErr) {
        /*
         we recieved a valid return value from the GetProperty call
         if the property's value is 1, then the codec CAN resume work following an interruption
         if the property's value is 0, then interruptions destroy the codec's state and we're done
         */
        
        if (canResume == 0) {
            canResumeFromInterruption = NO;
        }
        
        printf("Audio Converter %s continue after interruption!\n", (!canResumeFromInterruption ? "CANNOT" : "CAN"));
        
    } else {
        /*
         if the property is unimplemented (kAudioConverterErr_PropertyNotSupported, or paramErr returned in the case of PCM),
         then the codec being used is not a hardware codec so we're not concerned about codec state
         we are always going to be able to resume conversion after an interruption
         */
        
        if (error == kAudioConverterErr_PropertyNotSupported) {
            printf("kAudioConverterPropertyCanResumeFromInterruption property not supported - see comments in source for more info.\n");
            
        } else {
            printf("AudioConverterGetProperty kAudioConverterPropertyCanResumeFromInterruption result %d, paramErr is OK if PCM\n", (int)error);
        }
        
        error = noErr;
    }
复制代码

2.编码

2.1. 估算音频大小

kAudioConverterPropertyMaximumOutputPacketSize : 可以查询编码后音频数据最大数值.此值常用来估算音频编码后最大值.可以通过此值为音频数据分配空间.

UInt32 outputSizePerPacket = destFormat.mBytesPerPacket;
    if (outputSizePerPacket == 0) {
        // if the destination format is VBR, we need to get max size per packet from the converter
        UInt32 size = sizeof(outputSizePerPacket);
        if (![self checkError:AudioConverterGetProperty(audioConverter, kAudioConverterPropertyMaximumOutputPacketSize, &size, &outputSizePerPacket) withErrorString:@"AudioConverterGetProperty kAudioConverterPropertyMaximumOutputPacketSize failed!"]) {
            return;
        }
    }
复制代码

2.2. 为编码后音频数据预分配内存

我们可以将2.1中算出的最大size为这个Buffer list分配内存,也可用原始音频数据的大小为其分配内存,因为我们无法直接得知编码后数据到底是多大,所以用估算出来的最大值或原始数据大小分配内存都可以生效,因为最终编码器会将有效大小的值赋值进去.

// Set up output buffer list.
    AudioBufferList fillBufferList = {};
    fillBufferList.mNumberBuffers = 1;
    fillBufferList.mBuffers[0].mNumberChannels  = destFormat.mChannelsPerFrame;
    fillBufferList.mBuffers[0].mDataByteSize    = theOutputBufferSize;
    fillBufferList.mBuffers[0].mData            = malloc(theOutputBufferSize * sizeof(char));
复制代码

2.3. 编码音频数据

解析 AudioConverterFillComplexBuffer :用来编码音频数据.同时需要指定回调函数(C语言函数),

第二个参数即指定回调函数,此回调函数中主要做的是为即将编码的数据进行赋值,即我们要把原始音频数据赋值给回调函数中的 ioData 参数,这是我们在编码前最后一次控制原始音频数据,此回调函数执行后即完成了编码的过程,新的数据会填充到第五个参数中,也就是我们上面预定义的 fillBufferList .

userInfo
ioOutputDataPackets
outputPacketDescriptions

最终,我们将转换后得到的AAC数据以回调函数的形式传给调用者.

OSStatus EncodeConverterComplexInputDataProc(AudioConverterRef              inAudioConverter,
                                             UInt32                         *ioNumberDataPackets,
                                             AudioBufferList                *ioData,
                                             AudioStreamPacketDescription   **outDataPacketDescription,
                                             void                           *inUserData) {
    XDXConverterInfoType *info = (XDXConverterInfoType *)inUserData;
    ioData->mNumberBuffers              = 1;
    ioData->mBuffers[0].mData           = info->sourceBuffer;
    ioData->mBuffers[0].mNumberChannels = info->sourceChannelsPerFrame;
    ioData->mBuffers[0].mDataByteSize   = info->sourceDataSize;
    
    return noErr;
}

- (void)encodeFormatByConverter:(AudioConverterRef)audioConverter sourceBuffer:(void *)sourceBuffer sourceBufferSize:(UInt32)sourceBufferSize sourceFormat:(AudioStreamBasicDescription)sourceFormat dest:(AudioStreamBasicDescription)destFormat completeHandler:(void(^)(AudioBufferList *destBufferList, UInt32 outputPackets, AudioStreamPacketDescription *outputPacketDescriptions))completeHandler {
    ...
    
    XDXConverterInfoType userInfo   = {0};
    userInfo.sourceBuffer           = sourceBuffer;
    userInfo.sourceDataSize         = sourceBufferSize;
    userInfo.sourceChannelsPerFrame = sourceFormat.mChannelsPerFrame;
    
    UInt32 numberOutputPackets = 1;
    UInt32 theOutputBufferSize = sourceBufferSize;
    UInt32 ioOutputDataPackets = numberOutputPackets;
    AudioStreamPacketDescription outputPacketDescriptions;
    // Convert data
    OSStatus status = AudioConverterFillComplexBuffer(audioConverter,
                                                      EncodeConverterComplexInputDataProc,
                                                      &userInfo,
                                                      &ioOutputDataPackets,
                                                      &fillBufferList,
                                                      &outputPacketDescriptions);
    
    
    
    // if interrupted in the process of the conversion call, we must handle the error appropriately
    if (status != noErr) {
        if (status == kAudioConverterErr_HardwareInUse) {
            printf("Audio Converter returned kAudioConverterErr_HardwareInUse!\n");
        } else {
            if (![self checkError:status withErrorString:@"AudioConverterFillComplexBuffer error!"]) {
                return;
            }
        }
    } else {
        if (ioOutputDataPackets == 0) {
            // This is the EOF condition.
            status = noErr;
        }
        
        completeHandler(&fillBufferList, ioOutputDataPackets, &outputPacketDescriptions);
    }
}

复制代码

3. 模块对接

因为音频编码要依赖音频采集,所以我们这里以audio unit采集为例作示范,即使用audio unit采集pcm数据然后使用此模块编码得到aac数据.如需了解请参考如下链接

  • GitHub地址(附代码) : Audio Unit Capture
  • 简书地址 :Audio Unit Capture
  • 掘金地址 :Audio Unit Capture
  • 博客地址 :Audio Unit Capture

3.1. 初始化编码器

如下,在音频采集的类中声明一个编码器实例变量,然后初始化它. 仅仅需要设置原始数据格式,编码后的格式,采样率,使用硬编,软编即可.

@property (nonatomic, strong) XDXAduioEncoder *audioEncoder;

...

        self->_audioEncoder = [[XDXAduioEncoder alloc] initWithSourceFormat:m_audioDataFormat
                                                               destFormatID:kAudioFormatMPEG4AAC
                                                                 sampleRate:44100
                                                        isUseHardwareEncode:YES];
复制代码

3.2. 编码音频数据

在Audio Unit采集PCM音频数据的回调中将PCM数据送入编码器,然后在回调函数中将得到的AAC数据其写入文件.

static OSStatus AudioCaptureCallback(void                       *inRefCon,
                                     AudioUnitRenderActionFlags *ioActionFlags,
                                     const AudioTimeStamp       *inTimeStamp,
                                     UInt32                     inBusNumber,
                                     UInt32                     inNumberFrames,
                                     AudioBufferList            *ioData) {
    AudioUnitRender(m_audioUnit, ioActionFlags, inTimeStamp, inBusNumber, inNumberFrames, m_buffList);
    
    XDXAudioCaptureManager *manager = (__bridge XDXAudioCaptureManager *)inRefCon;

    void    *bufferData = m_buffList->mBuffers[0].mData;
    UInt32   bufferSize = m_buffList->mBuffers[0].mDataByteSize;
    
    [manager.audioEncoder encodeAudioWithSourceBuffer:bufferData
                                       sourceBufferSize:bufferSize
                                        completeHandler:^(AudioBufferList * _Nonnull destBufferList, UInt32 outputPackets, AudioStreamPacketDescription * _Nonnull outputPacketDescriptions) {
                                            if (manager.isRecordVoice) {
                                                [[XDXAudioFileHandler getInstance] writeFileWithInNumBytes:destBufferList->mBuffers->mDataByteSize
                                                                                              ioNumPackets:outputPackets
                                                                                                  inBuffer:destBufferList->mBuffers->mData
                                                                                              inPacketDesc:outputPacketDescriptions];
                                            }
                                            
                                            free(destBufferList->mBuffers->mData);
                                        }];
    

    
    return noErr;
}
复制代码

以上就是本文的全部内容,希望本文的内容对大家的学习或者工作能带来一定的帮助,也希望大家多多支持 码农网

查看所有标签

猜你喜欢:

本站部分资源来源于网络,本站转载出于传递更多信息之目的,版权归原作者或者来源机构所有,如转载稿涉及版权问题,请联系我们

数据结构 Python语言描述

数据结构 Python语言描述

[美] Kenneth A. Lambert 兰伯特 / 李军 / 人民邮电出版社 / 2017-12-1 / CNY 69.00

在计算机科学中,数据结构是一门进阶性课程,概念抽象,难度较大。Python语言的语法简单,交互性强。用Python来讲解数据结构等主题,比C语言等实现起来更为容易,更为清晰。 《数据结构 Python语言描述》第1章简单介绍了Python语言的基础知识和特性。第2章到第4章对抽象数据类型、数据结构、复杂度分析、数组和线性链表结构进行了详细介绍,第5章和第6章重点介绍了面向对象设计的相关知识、......一起来看看 《数据结构 Python语言描述》 这本书的介绍吧!

HTML 压缩/解压工具
HTML 压缩/解压工具

在线压缩/解压 HTML 代码

MD5 加密
MD5 加密

MD5 加密工具

Markdown 在线编辑器
Markdown 在线编辑器

Markdown 在线编辑器