
Invalid resource manager ID in primary checkpoint record error


Be aware that following the steps below can cause data loss, especially if your system is heavily corrupted. Make sure you have a backup of your data before proceeding.
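
One way to take that backup is a file-level archive of the data directory, made from a shell inside the container (see step 1) while PostgreSQL is not running. The sketch below is only an illustration: the function name `backup_pgdata` and the destination `/backup` are hypothetical, and it assumes `tar` and `gzip` are available in the image and that the destination is on storage that survives a container restart and lies outside the data directory.

```shell
# Hypothetical helper: archive a PostgreSQL data directory before resetting the WAL.
# Usage: backup_pgdata <data-directory> <destination-directory>
backup_pgdata() {
  src="$1"
  dest_dir="$2"

  # Refuse to continue if the data directory does not exist.
  [ -d "$src" ] || { echo "data directory not found: $src" >&2; return 1; }

  mkdir -p "$dest_dir" || return 1

  # Timestamped archive name so repeated runs do not overwrite each other.
  archive="$dest_dir/pgdata-$(date +%Y%m%d%H%M%S).tar.gz"

  # Archive the directory relative to its parent so it restores as one folder.
  tar -czf "$archive" -C "$(dirname "$src")" "$(basename "$src")" || return 1

  # Print the archive path so the caller can record it.
  echo "$archive"
}
```

For example, `backup_pgdata /var/lib/postgresql/data /backup` would write an archive under the assumed `/backup` mount and print its path.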

  1. Get access to the container shell using the following command:

     kubectl exec -it <pod-name> -n <namespace> -- /bin/sh

    For a Docker container, use the following command instead:

    docker exec -it <container-id> /bin/sh

    If you cannot get a shell because the container is in a crash loop, make the following changes to the deployment manifest and apply it again. Overriding the command keeps the container running without starting PostgreSQL.

    - name: postgres
      image: postgres:13.2
      imagePullPolicy: IfNotPresent
      command: ["/bin/sh"]
      args: ["-c", "while true; do sleep 30; done;"]
  2. Run the following commands to reset the WAL files:

    You will need to run these commands as the postgres user.

    su - postgres
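    # Dry run: print the values that would be written, without changing anything
    pg_resetwal --dry-run /var/lib/postgresql/data
    # Force the reset even if pg_resetwal cannot determine valid data for pg_control
    pg_resetwal -f /var/lib/postgresql/data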
  3. If you changed the deployment manifest in step 1, remove the command and args overrides and apply the manifest again. Then restart the PostgreSQL container.
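
The dry run and the forced reset in step 2 can be combined so that the destructive command only runs after the dry-run output has been reviewed. The following is a minimal sketch, not part of the original procedure: the function name `reset_wal_guarded` and the confirmation prompt are hypothetical, and it assumes it is run as the postgres user with `pg_resetwal` on the `PATH`.

```shell
# Hypothetical wrapper: show the dry run, then reset only after explicit confirmation.
# Usage: reset_wal_guarded [data-directory]
reset_wal_guarded() {
  datadir="${1:-/var/lib/postgresql/data}"

  # Print what pg_resetwal would write; stop here if the dry run itself fails.
  pg_resetwal --dry-run "$datadir" || return 1

  printf 'Reset the WAL in %s? Type "yes" to continue: ' "$datadir"
  read -r answer

  if [ "$answer" != "yes" ]; then
    echo "Aborted; nothing was changed."
    return 1
  fi

  # Same forced reset as in step 2.
  pg_resetwal -f "$datadir"
}
```

Calling `reset_wal_guarded` with no argument uses the data directory path from step 2. Any answer other than `yes` leaves the data directory untouched.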