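Conditions under which PostgreSQL fails to read the checkpoint record during startup recovery
Category: Databases · PostgreSQL · Published: 6 years ago
Summary: I. Under what conditions is the checkpoint record read as record == NULL? II. When does ReadRecord return NULL? III. When does XLogReadRecord return NULL while reading the checkpoint?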
At startup, recovery locates its starting checkpoint in two steps:
1. First it reads the checkpoint that ControlFile->checkPoint points to.
2. If that read fails, a standby aborts immediately (PANIC), while a primary retries with the checkpoint that ControlFile->prevCheckPoint points to. (The prevCheckPoint fallback exists only in releases before PostgreSQL 11.)

    StartupXLOG->
    |--checkPointLoc = ControlFile->checkPoint;
    |--record = ReadCheckpointRecord(xlogreader, checkPointLoc, 1, true);
    |--if (record != NULL) {
           ...
       } else if (StandbyMode) {
           ereport(PANIC, (errmsg("could not locate a valid checkpoint record")));
       } else {
           checkPointLoc = ControlFile->prevCheckPoint;
           record = ReadCheckpointRecord(xlogreader, checkPointLoc, 2, true);
           if (record != NULL) {
               InRecovery = true;   /* mark that we enter recovery below */
           } else {
               ereport(PANIC, (errmsg("could not locate a valid checkpoint record")));
           }
       }
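The fallback order above can be modelled as a small decision function. This is only a sketch of the control flow, not PostgreSQL code: `CkptChoice`, `choose_checkpoint` and the two boolean inputs (standing in for "ReadCheckpointRecord returned non-NULL") are hypothetical names introduced for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Outcome of the checkpoint lookup at startup (hypothetical names). */
typedef enum {
    CKPT_USE_PRIMARY,   /* ControlFile->checkPoint was readable */
    CKPT_USE_PREVIOUS,  /* fell back to ControlFile->prevCheckPoint */
    CKPT_PANIC          /* "could not locate a valid checkpoint record" */
} CkptChoice;

/*
 * primary_ok / previous_ok stand in for "ReadCheckpointRecord returned
 * a non-NULL record" for the respective location.
 */
static CkptChoice
choose_checkpoint(bool primary_ok, bool previous_ok, bool standby_mode)
{
    if (primary_ok)
        return CKPT_USE_PRIMARY;
    if (standby_mode)
        return CKPT_PANIC;      /* a standby never tries prevCheckPoint */
    return previous_ok ? CKPT_USE_PREVIOUS : CKPT_PANIC;
}
```

Note that a readable prevCheckPoint does not help a standby: `choose_checkpoint(false, true, true)` still yields `CKPT_PANIC`.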
I. Under what conditions is the checkpoint record read as record == NULL?
1. ControlFile->checkPoint % XLOG_BLCKSZ < SizeOfXLogShortPHD
2. ReadRecord(xlogreader, ControlFile->checkPoint, LOG, true) returns NULL
3. ReadRecord returns record != NULL, but record->xl_rmid != RM_XLOG_ID
4. ReadRecord returns record != NULL, but info != XLOG_CHECKPOINT_SHUTDOWN && info != XLOG_CHECKPOINT_ONLINE (where info = record->xl_info & ~XLR_INFO_MASK)
5. ReadRecord returns record != NULL, but record->xl_tot_len != SizeOfXLogRecord + SizeOfXLogRecordDataHeaderShort + sizeof(CheckPoint)
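The five checks above can be written as one validation routine. The sketch below is a simplified stand-in: `DemoRecord`, `CkResult`, `check_checkpoint_record` and every `DEMO_*` constant are hypothetical, the numeric values are assumptions for a default build, and the expected record length is passed in because sizeof(CheckPoint) differs between versions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed values for illustration; real ones come from your build's headers. */
#define DEMO_XLOG_BLCKSZ          8192u
#define DEMO_SHORT_PHD_SIZE       24u
#define DEMO_RM_XLOG_ID           0u
#define DEMO_CHECKPOINT_SHUTDOWN  0x00u
#define DEMO_CHECKPOINT_ONLINE    0x10u
#define DEMO_XLR_INFO_MASK        0x0Fu

/* Only the header fields the checks look at. */
typedef struct {
    uint32_t xl_tot_len;
    uint8_t  xl_info;
    uint8_t  xl_rmid;
} DemoRecord;

typedef enum {
    CK_OK, CK_BAD_OFFSET, CK_READ_FAILED, CK_BAD_RMID, CK_BAD_INFO, CK_BAD_LENGTH
} CkResult;

/* rec == NULL models "ReadRecord returned NULL". */
static CkResult
check_checkpoint_record(uint64_t ckpt_lsn, const DemoRecord *rec,
                        uint32_t expected_len)
{
    uint8_t info;

    if (ckpt_lsn % DEMO_XLOG_BLCKSZ < DEMO_SHORT_PHD_SIZE)
        return CK_BAD_OFFSET;       /* 1: location falls inside a page header */
    if (rec == NULL)
        return CK_READ_FAILED;      /* 2 */
    if (rec->xl_rmid != DEMO_RM_XLOG_ID)
        return CK_BAD_RMID;         /* 3 */
    info = (uint8_t) (rec->xl_info & ~DEMO_XLR_INFO_MASK);
    if (info != DEMO_CHECKPOINT_SHUTDOWN && info != DEMO_CHECKPOINT_ONLINE)
        return CK_BAD_INFO;         /* 4 */
    if (rec->xl_tot_len != expected_len)
        return CK_BAD_LENGTH;       /* 5 */
    return CK_OK;
}
```

The checks run in the order of the list, so a record that fails several of them is reported under the first one it hits.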
II. Conditions under which ReadRecord returns NULL
    ReadRecord(xlogreader, ControlFile->checkPoint, LOG, true)
    |--record = XLogReadRecord(xlogreader, ControlFile->checkPoint, &errormsg);
    |-- 2.1 record == NULL && !StandbyMode
    |-- 2.2 record != NULL && !tliInHistory(xlogreader->latestPageTLI, expectedTLEs)

    /*-----
     note: whenever a page of xlog is read, latestPageTLI is set to the
     timeline of the first record on that page:
     XLogReaderValidatePageHeader --> xlogreader->latestPageTLI = hdr->xlp_tli;
     ------*/
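The timeline test in 2.2 amounts to a membership check of the page's timeline in the expected history. The sketch below uses a plain array where PostgreSQL keeps a list of history entries; `tli_in_history` and `record_usable` are hypothetical names, and the function only decides whether the record is usable, not whether the caller then retries or gives up.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* True if tli appears in the expected timeline history (array stand-in). */
static bool
tli_in_history(uint32_t tli, const uint32_t *expected, size_t n)
{
    assert(expected != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
    {
        if (expected[i] == tli)
            return true;
    }
    return false;
}

/*
 * Models cases 2.1 and 2.2: a record is usable only if it was read at all
 * and the page it came from belongs to an expected timeline.
 */
static bool
record_usable(bool record_found, uint32_t latest_page_tli,
              const uint32_t *expected, size_t n)
{
    if (!record_found)
        return false;                                    /* 2.1 */
    return tli_in_history(latest_page_tli, expected, n); /* 2.2 */
}
```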
III. Under what conditions does XLogReadRecord return NULL when reading the checkpoint?
    XLogReadRecord(xlogreader, ControlFile->checkPoint, &errormsg)
        targetPagePtr = ControlFile->checkPoint - (ControlFile->checkPoint % XLOG_BLCKSZ);
        targetRecOff = ControlFile->checkPoint % XLOG_BLCKSZ;
        readOff = ReadPageInternal(state, targetPagePtr, Min(targetRecOff + SizeOfXLogRecord, XLOG_BLCKSZ));
        pageHeaderSize = XLogPageHeaderSize((XLogPageHeader) state->readBuf);
        record = (XLogRecord *) (state->readBuf + RecPtr % XLOG_BLCKSZ);
        total_len = record->xl_tot_len;
    -------------
    1. readOff < 0
    2. 0 < targetRecOff < pageHeaderSize
    3. (((XLogPageHeader) state->readBuf)->xlp_info & XLP_FIRST_IS_CONTRECORD) && targetRecOff == pageHeaderSize
       The page starts with the continuation of a record that spans pages, and the checkpoint offset points exactly at the end of the page header.
    4. targetRecOff <= XLOG_BLCKSZ - SizeOfXLogRecord && !ValidXLogRecordHeader(state, ControlFile->checkPoint, state->ReadRecPtr, record, randAccess)
       --- (record->xl_tot_len < SizeOfXLogRecord || record->xl_rmid > RM_MAX_ID || record->xl_prev != state->ReadRecPtr)
       When randAccess is set, as it is for a read at an explicit position like this one, the prev-link test is instead that xl_prev must be below the record's own address.
    5. targetRecOff > XLOG_BLCKSZ - SizeOfXLogRecord && total_len < SizeOfXLogRecord
    6. total_len > state->readRecordBufSize && !allocate_recordbuf(state, total_len)
       If the record is corrupted and total_len is very large, allocate_recordbuf has to enlarge state->readRecordBuf, and that allocation may fail, which aborts the read.
       The record's checksum (CRC) is verified only after the complete record has been read.
    -------------
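The page and offset arithmetic at the top of XLogReadRecord, plus the offset test of condition 2, can be tried out in isolation. A minimal sketch follows, assuming an 8192-byte WAL page and a 24-byte record header; `DemoTarget`, `locate_record`, `offset_inside_page_header` and the `DEMO_*` constants are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed sizes for illustration; check the headers of your build. */
#define DEMO_XLOG_BLCKSZ        8192u
#define DEMO_SIZE_OF_XLOGRECORD 24u

typedef struct {
    uint64_t page_ptr;  /* start of the page containing the record */
    uint32_t rec_off;   /* offset of the record within that page */
    uint32_t req_len;   /* bytes of the page requested from ReadPageInternal */
} DemoTarget;

/* Split a record pointer into page start, in-page offset and request length. */
static DemoTarget
locate_record(uint64_t rec_ptr)
{
    DemoTarget t;
    uint32_t   want;

    t.rec_off = (uint32_t) (rec_ptr % DEMO_XLOG_BLCKSZ);
    t.page_ptr = rec_ptr - t.rec_off;
    /* Ask for the record header too, but never beyond the page. */
    want = t.rec_off + DEMO_SIZE_OF_XLOGRECORD;
    t.req_len = want < DEMO_XLOG_BLCKSZ ? want : DEMO_XLOG_BLCKSZ;
    return t;
}

/* Condition 2: a non-zero offset that points into the page header is invalid. */
static bool
offset_inside_page_header(uint32_t rec_off, uint32_t page_header_size)
{
    return rec_off > 0 && rec_off < page_header_size;
}
```

For a record at 0x3000028 this gives page 0x3000000, offset 40 and a 64-byte request. A record starting 8 bytes before the page end has its request capped at the page size, which is the situation condition 5 deals with.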
III.1. Conditions under which the readOff returned by ReadPageInternal is less than 0
    ReadPageInternal(state, targetPagePtr, Min(targetRecOff + SizeOfXLogRecord, XLOG_BLCKSZ))
    1. On the first read of a WAL file, readLen = state->read_page reads the first page of the file: readLen < 0
    2. readLen > 0 && !XLogReaderValidatePageHeader(state, targetSegmentPtr, state->readBuf)
    --
    3. Reading the page that contains the checkpoint, readLen = state->read_page: readLen < 0
    4. readLen > 0 && readLen <= SizeOfXLogShortPHD
    5. !XLogReaderValidatePageHeader(state, pageptr, (char *) hdr)
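The two-stage read described above (first page of the file, then the target page) can be modelled with the outcome of each step supplied as input. This is a sketch only: `DemoPageRead`, `read_page_internal` and `DEMO_SHORT_PHD_SIZE` are hypothetical, the 24-byte short header size is an assumption, and `need_first_page` stands for "this WAL file is being read for the first time".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define DEMO_SHORT_PHD_SIZE 24  /* assumed size of the short page header */

/* Outcomes of the individual steps, supplied by the caller. */
typedef struct {
    bool need_first_page;   /* WAL file is being read for the first time */
    int  first_read_len;    /* result of read_page for the first page */
    bool first_header_ok;   /* header validation of the first page */
    int  target_read_len;   /* result of read_page for the target page */
    bool target_header_ok;  /* header validation of the target page */
} DemoPageRead;

/* Returns the target read length, or -1 for any of the five failure cases. */
static int
read_page_internal(const DemoPageRead *r)
{
    assert(r != NULL);
    if (r->need_first_page)
    {
        if (r->first_read_len < 0)
            return -1;                              /* 1 */
        if (!r->first_header_ok)
            return -1;                              /* 2 */
    }
    if (r->target_read_len < 0)
        return -1;                                  /* 3 */
    if (r->target_read_len <= DEMO_SHORT_PHD_SIZE)
        return -1;                                  /* 4 */
    if (!r->target_header_ok)
        return -1;                                  /* 5 */
    return r->target_read_len;
}
```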
IV. When does XLogPageRead return a value < 0?
    /*
     * 1. WaitForWALToBecomeAvailable fails to open the file
     * 2. lseek fails && !StandbyMode
     * 3. read fails && !StandbyMode
     * 4. page header validation fails && !StandbyMode
     * In StandbyMode the function retries instead: it goes back to
     * WaitForWALToBecomeAvailable, which switches the WAL source and opens again.
     */
    !WaitForWALToBecomeAvailable(targetPagePtr + reqLen, private->randAccess, 1, targetRecPtr)   // open
    |-- return -1
    readOff = targetPageOff;
    if (lseek(readFile, (off_t) readOff, SEEK_SET) < 0) {
        !StandbyMode: return -1
    }
    if (read(readFile, readBuf, XLOG_BLCKSZ) != XLOG_BLCKSZ) {
        !StandbyMode: return -1
    }
    !XLogReaderValidatePageHeader(xlogreader, targetPagePtr, readBuf)
        !StandbyMode: return -1
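The lseek and read steps can be reproduced with plain POSIX calls. The sketch below covers only the non-standby behaviour, where any failure is reported as -1; `read_wal_page` and `DEMO_XLOG_BLCKSZ` are hypothetical, and an 8192-byte page is assumed.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

#define DEMO_XLOG_BLCKSZ 8192   /* assumed WAL page size */

/*
 * Read exactly one WAL page at page_off into buf.
 * Returns the page size on success, -1 if the seek fails or the read is short.
 */
static int
read_wal_page(int fd, off_t page_off, char *buf)
{
    assert(buf != NULL);
    if (lseek(fd, page_off, SEEK_SET) < 0)
        return -1;                      /* case 2: lseek failed */
    if (read(fd, buf, DEMO_XLOG_BLCKSZ) != (ssize_t) DEMO_XLOG_BLCKSZ)
        return -1;                      /* case 3: error or short read */
    return DEMO_XLOG_BLCKSZ;
}
```

A short read counts as a failure here, just as in the pseudo-code above, where anything other than a full XLOG_BLCKSZ read is rejected.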
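V. When does WaitForWALToBecomeAvailable return false?
    -- XLOG_FROM_ARCHIVE | XLOG_FROM_PG_WAL
    1. First, XLogFileReadAnyTLI opens the log:
       1. It iterates over every timeline in the timeline list, starting with the newest.
       2. When the checkpoint is being read, the source is XLOG_FROM_ANY.
       3. It first tries to open the archived log; if that fails, it tries the log in pg_wal.
       4. If neither opens, it steps back one timeline and tries the file with the same segno (file number) on the previous timeline.
       5. After a successful open, expectedTLEs is set to the full current timeline list.
    2. If the open fails, the log source is switched: XLOG_FROM_ARCHIVE | XLOG_FROM_PG_WAL -> XLOG_FROM_STREAM
    3. At the point of switching away from XLOG_FROM_ARCHIVE | XLOG_FROM_PG_WAL:
       standby && promotion requested: return false
       !StandbyMode: return false
    -- XLOG_FROM_STREAM
    1. !WalRcvStreaming(), i.e. the receiver process has died: switch the log source
    2. CheckForStandbyTrigger() is true: switch the log source
    3. XLOG_FROM_STREAM -> XLOG_FROM_ARCHIVE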
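The source switching described above behaves like a small state machine. The sketch below encodes one step of it under simplifying assumptions: `DemoSource`, `DemoAction` and `next_action` are hypothetical names, all inputs are plain booleans, and `ACT_WAIT` (keep waiting for streamed WAL) is an assumption added so that the function is total.

```c
#include <assert.h>
#include <stdbool.h>

typedef enum { SRC_ARCHIVE_OR_PG_WAL, SRC_STREAM } DemoSource;

typedef enum {
    ACT_RETURN_TRUE,        /* file opened, WAL is available */
    ACT_RETURN_FALSE,       /* caller sees "no more WAL" */
    ACT_SWITCH_TO_STREAM,   /* XLOG_FROM_ARCHIVE | XLOG_FROM_PG_WAL -> XLOG_FROM_STREAM */
    ACT_SWITCH_TO_ARCHIVE,  /* XLOG_FROM_STREAM -> XLOG_FROM_ARCHIVE */
    ACT_WAIT                /* assumption: keep waiting for streamed WAL */
} DemoAction;

/* One step of the source-switching logic, following the list above. */
static DemoAction
next_action(DemoSource src, bool open_ok, bool standby_mode,
            bool promote_triggered, bool walrcv_streaming)
{
    if (src == SRC_ARCHIVE_OR_PG_WAL)
    {
        if (open_ok)
            return ACT_RETURN_TRUE;
        if (!standby_mode)
            return ACT_RETURN_FALSE;    /* !StandbyMode: return false */
        if (promote_triggered)
            return ACT_RETURN_FALSE;    /* standby being promoted */
        return ACT_SWITCH_TO_STREAM;
    }

    /* SRC_STREAM: a dead receiver or a promotion request ends streaming. */
    if (!walrcv_streaming || promote_triggered)
        return ACT_SWITCH_TO_ARCHIVE;
    return ACT_WAIT;
}
```

In this model false is only ever returned from the archive/pg_wal state, which is why a streaming failure first cycles back to XLOG_FROM_ARCHIVE.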