Troubleshooting

    For known issues, see the MTC release notes.

    You can migrate Kubernetes resources, persistent volume data, and internal container images to OKD 4.8 by using the Migration Toolkit for Containers (MTC) web console or the Kubernetes API.

    MTC migrates the following resources:

    • A namespace specified in a migration plan.

    • Namespace-scoped resources: When the MTC migrates a namespace, it migrates all the objects and resources associated with that namespace, such as services or pods. Additionally, if a resource that exists in the namespace but not at the cluster level depends on a resource that exists at the cluster level, the MTC migrates both resources.

      For example, a security context constraint (SCC) is a resource that exists at the cluster level and a service account (SA) is a resource that exists at the namespace level. If an SA exists in a namespace that the MTC migrates, the MTC automatically locates any SCCs that are linked to the SA and also migrates those SCCs. Similarly, the MTC migrates persistent volume claims that are linked to the persistent volumes of the namespace.

    • Custom resources (CRs) and custom resource definitions (CRDs): MTC automatically migrates CRs and CRDs at the namespace level.

    Migrating an application with the MTC web console involves the following steps:

    1. Install the Migration Toolkit for Containers Operator on all clusters.

      You can install the Migration Toolkit for Containers Operator in a restricted environment with limited or no internet access. The source and target clusters must have network access to each other and to a mirror registry.

    2. Configure the replication repository, an intermediate object storage that MTC uses to migrate data.

      The source and target clusters must have network access to the replication repository during migration. In a restricted environment, you can use Multi-Cloud Object Gateway (MCG). If you are using a proxy server, you must configure it to allow network traffic between the replication repository and the clusters.

    3. Add the source cluster to the MTC web console.

    4. Add the replication repository to the MTC web console.

    5. Create a migration plan, with one of the following data migration options:

      • Copy: MTC copies the data from the source cluster to the replication repository, and from the replication repository to the target cluster.

        If you are using direct image migration or direct volume migration, the images or volumes are copied directly from the source cluster to the target cluster.

      • Move: MTC unmounts a remote volume, for example, NFS, from the source cluster, creates a PV resource on the target cluster pointing to the remote volume, and then mounts the remote volume on the target cluster. Applications running on the target cluster use the same remote volume that the source cluster was using. The remote volume must be accessible to the source and target clusters.

        Although the replication repository does not appear in this diagram, it is required for migration.

        (Diagram: migration PV move)

    6. Run the migration plan, with one of the following options:

      • Stage copies data to the target cluster without stopping the application.

        A stage migration can be run multiple times so that most of the data is copied to the target before migration. Running one or more stage migrations reduces the duration of the cutover migration.

      • Cutover stops the application on the source cluster and moves the resources to the target cluster.

        Optional: You can clear the Halt transactions on the source cluster during migration checkbox.
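    If you drive MTC through the Kubernetes API instead of the web console, the stage and cutover runs in step 6 correspond to MigMigration CRs that reference the migration plan. The following is a minimal sketch based on the MigMigration manifest documented later in this section; the CR name is a placeholder:

    $ cat << EOF | oc apply -f -
    apiVersion: migration.openshift.io/v1alpha1
    kind: MigMigration
    metadata:
      name: <stage_migration>
      namespace: openshift-migration
    spec:
      migPlanRef:
        name: <migplan>
        namespace: openshift-migration
      stage: true        # stage run: copy data without stopping the application
      quiescePods: false # for the cutover, set stage to false and quiescePods to true
    EOF

    For the cutover, you create another MigMigration CR with stage set to false and, if the application must be stopped, quiescePods set to true.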

    The Migration Toolkit for Containers (MTC) creates the following custom resources (CRs):

    (Diagram: MTC migration architecture)

    • MigCluster (configuration, MTC cluster): Cluster definition

    • MigStorage (configuration, MTC cluster): Storage definition

    • MigPlan (configuration, MTC cluster): Migration plan

    The MigPlan CR describes the source and target clusters, replication repository, and namespaces being migrated. It is associated with 0, 1, or many MigMigration CRs.

    Deleting a MigPlan CR deletes the associated MigMigration CRs.

    • BackupStorageLocation (configuration, MTC cluster): Location of Velero backup objects

    • VolumeSnapshotLocation (configuration, MTC cluster): Location of Velero volume snapshots

    • MigMigration (action, MTC cluster): Migration, created every time you stage or migrate data. Each MigMigration CR is associated with a MigPlan CR.

    • Backup (action, source cluster): When you run a migration plan, the MigMigration CR creates two Velero backup CRs on each source cluster:

    • Backup CR #1 for Kubernetes objects

    • Backup CR #2 for PV data

    • Restore (action, target cluster): When you run a migration plan, the MigMigration CR creates two Velero restore CRs on the target cluster:

    • Restore CR #1 (using Backup CR #2) for PV data

    • Restore CR #2 (using Backup CR #1) for Kubernetes objects
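    To see these CRs on a cluster, you can list them in the openshift-migration namespace, for example:

    $ oc get migcluster,migstorage,migplan,migmigration -n openshift-migration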

    MTC custom resource manifests

    Migration Toolkit for Containers (MTC) uses the following custom resource (CR) manifests for migrating applications.

    DirectImageMigration

    The DirectImageMigration CR copies images directly from the source cluster to the destination cluster.
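    A minimal sketch of the manifest, modeled on the DirectImageStreamMigration example below and on the two callouts that follow; treat the exact field layout as an assumption rather than a definitive reference:

    apiVersion: migration.openshift.io/v1alpha1
    kind: DirectImageMigration
    metadata:
      labels:
        controller-tools.k8s.io: "1.0"
      name: <direct_image_migration>
    spec:
      srcMigClusterRef:
        name: <source_cluster>
        namespace: openshift-migration
      destMigClusterRef:
        name: <destination_cluster>
        namespace: openshift-migration
      namespaces: (1)
      - <source_namespace_1>
      - <source_namespace_2>:<destination_namespace> (2)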

    (1) One or more namespaces containing images to be migrated. By default, the destination namespace has the same name as the source namespace.
    (2) Source namespace mapped to a destination namespace with a different name.

    DirectImageStreamMigration

    The DirectImageStreamMigration CR copies image stream references directly from the source cluster to the destination cluster.

    apiVersion: migration.openshift.io/v1alpha1
    kind: DirectImageStreamMigration
    metadata:
      labels:
        controller-tools.k8s.io: "1.0"
      name: <direct_image_stream_migration>
    spec:
      srcMigClusterRef:
        name: <source_cluster>
        namespace: openshift-migration
      destMigClusterRef:
        name: <destination_cluster>
        namespace: openshift-migration
      imageStreamRef:
        name: <image_stream>
        namespace: <source_image_stream_namespace>
      destNamespace: <destination_image_stream_namespace>

    DirectVolumeMigration

    The DirectVolumeMigration CR copies persistent volumes (PVs) directly from the source cluster to the destination cluster.

    apiVersion: migration.openshift.io/v1alpha1
    kind: DirectVolumeMigration
    metadata:
      name: <direct_volume_migration>
      namespace: openshift-migration
    spec:
      createDestinationNamespaces: false (1)
      deleteProgressReportingCRs: false (2)
      destMigClusterRef:
        name: <host_cluster> (3)
        namespace: openshift-migration
      persistentVolumeClaims:
      - name: <pvc> (4)
        namespace: <pvc_namespace>
      srcMigClusterRef:
        name: <source_cluster>
        namespace: openshift-migration

    (1) Set to true to create namespaces for the PVs on the destination cluster.
    (2) Set to true to delete DirectVolumeMigrationProgress CRs after migration. The default is false so that DirectVolumeMigrationProgress CRs are retained for troubleshooting.
    (3) Update the cluster name if the destination cluster is not the host cluster.
    (4) Specify one or more PVCs to be migrated.

    DirectVolumeMigrationProgress

    The DirectVolumeMigrationProgress CR shows the progress of the DirectVolumeMigration CR.

    apiVersion: migration.openshift.io/v1alpha1
    kind: DirectVolumeMigrationProgress
    metadata:
      labels:
        controller-tools.k8s.io: "1.0"
      name: <direct_volume_migration_progress>
    spec:
      clusterRef:
        name: <source_cluster>
        namespace: openshift-migration
      podRef:
        name: <rsync_pod>
        namespace: openshift-migration
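    To monitor a direct volume migration from the CLI, you can list these CRs in the openshift-migration namespace, for example:

    $ oc get directvolumemigration,directvolumemigrationprogress -n openshift-migration

    The status of each DirectVolumeMigrationProgress CR reports on the Rsync pod named in its podRef field.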

    MigAnalytic

    The MigAnalytic CR collects the number of images, Kubernetes resources, and the persistent volume (PV) capacity from an associated MigPlan CR.

    You can configure the data that it collects.

    apiVersion: migration.openshift.io/v1alpha1
    kind: MigAnalytic
    metadata:
      annotations:
        migplan: <migplan>
      name: <miganalytic>
      namespace: openshift-migration
      labels:
        migplan: <migplan>
    spec:
      analyzeImageCount: true (1)
      analyzeK8SResources: true (2)
      analyzePVCapacity: true (3)
      listImages: false (4)
      listImagesLimit: 50 (5)
      migPlanRef:
        name: <migplan>
        namespace: openshift-migration

    (1) Optional: Returns the number of images.
    (2) Optional: Returns the number, kind, and API version of the Kubernetes resources.
    (3) Optional: Returns the PV capacity.
    (4) Returns a list of image names. The default is false so that the output is not excessively long.
    (5) Optional: Specify the maximum number of image names to return if listImages is true.
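    For example, after creating a MigAnalytic CR that references a migration plan, you can read the collected numbers back from the CR:

    $ oc get miganalytic <miganalytic> -n openshift-migration -o yaml

    The counts and capacity that were enabled in the spec appear under status in the output.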

    MigCluster

    The MigCluster CR defines a host, local, or remote cluster.

    apiVersion: migration.openshift.io/v1alpha1
    kind: MigCluster
    metadata:
      labels:
        controller-tools.k8s.io: "1.0"
      name: <host_cluster> (1)
      namespace: openshift-migration
    spec:
      isHostCluster: true (2)
      # The 'azureResourceGroup' parameter is relevant only for Microsoft Azure.
      azureResourceGroup: <azure_resource_group> (3)
      caBundle: <ca_bundle_base64> (4)
      insecure: false (5)
      refresh: false (6)
      # The 'restartRestic' parameter is relevant for a source cluster.
      restartRestic: true (7)
      # The following parameters are relevant for a remote cluster.
      exposedRegistryPath: <registry_route> (8)
      url: <destination_cluster_url> (9)
      serviceAccountSecretRef:
        name: <source_secret> (10)
        namespace: openshift-config

    (1) Update the cluster name if the migration-controller pod is not running on this cluster.
    (2) The migration-controller pod runs on this cluster if true.
    (3) Microsoft Azure only: Specify the resource group.
    (4) Optional: If you created a certificate bundle for self-signed CA certificates and if the insecure parameter value is false, specify the base64-encoded certificate bundle.
    (5) Set to true to disable SSL verification.
    (6) Set to true to validate the cluster.
    (7) Set to true to restart the Restic pods on the source cluster after the Stage pods are created.
    (8) Remote cluster and direct image migration only: Specify the exposed secure registry path.
    (9) Remote cluster only: Specify the URL.
    (10) Remote cluster only: Specify the name of the Secret CR.
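    After creating or updating a MigCluster CR, you can check whether MTC considers the cluster usable by inspecting its conditions, for example:

    $ oc describe migcluster <host_cluster> -n openshift-migration

    Connection, certificate, or service account secret problems are surfaced as conditions in the output.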

    MigHook

    The MigHook CR defines a migration hook that runs custom code at a specified stage of the migration. You can create up to four migration hooks. Each hook runs during a different phase of the migration.

    You can configure the hook name, runtime duration, a custom image, and the cluster where the hook will run.

    The migration phases and namespaces of the hooks are configured in the MigPlan CR.

    apiVersion: migration.openshift.io/v1alpha1
    kind: MigHook
    metadata:
      generateName: <hook_name_prefix> (1)
      name: <mighook> (2)
      namespace: openshift-migration
    spec:
      activeDeadlineSeconds: 1800 (3)
      custom: false (4)
      image: <hook_image> (5)
      playbook: <ansible_playbook_base64> (6)
      targetCluster: source (7)

    (1) Optional: A unique hash is appended to the value for this parameter so that each migration hook has a unique name. You do not need to specify the value of the name parameter.
    (2) Specify the migration hook name, unless you specify the value of the generateName parameter.
    (3) Optional: Specify the maximum number of seconds that a hook can run. The default is 1800.
    (4) The hook is a custom image if true. The custom image can include Ansible or it can be written in a different programming language.
    (5) Specify the custom image, for example, quay.io/konveyor/hook-runner:latest. Required if custom is true.
    (6) Base64-encoded Ansible playbook. Required if custom is false.
    (7) Specify the cluster on which the hook will run. Valid values are source or destination.
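    The playbook parameter expects the Ansible playbook as a base64-encoded string. One way to produce that value, assuming the playbook is saved locally as playbook.yml, is:

    $ base64 -w0 playbook.yml

    Paste the output into the playbook field of the MigHook CR.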

    MigMigration

    The MigMigration CR runs a MigPlan CR.

    You can configure a MigMigration CR to run a stage or incremental migration, to cancel a migration in progress, or to roll back a completed migration.

    apiVersion: migration.openshift.io/v1alpha1
    kind: MigMigration
    metadata:
      labels:
        controller-tools.k8s.io: "1.0"
      name: <migmigration>
      namespace: openshift-migration
    spec:
      canceled: false (1)
      rollback: false (2)
      stage: false (3)
      quiescePods: true (4)
      keepAnnotations: true (5)
      verify: false (6)
      migPlanRef:
        name: <migplan>
        namespace: openshift-migration

    (1) Set to true to cancel a migration in progress.
    (2) Set to true to roll back a completed migration.
    (3) Set to true to run a stage migration, which copies data without stopping the application.
    (4) Set to true to stop the application on the source cluster during migration.
    (5) Set to true to retain the labels and annotations that are applied during the migration.
    (6) Set to true to verify the status of the migrated pods on the destination cluster.

    MigPlan

    The MigPlan CR defines the parameters of a migration plan.

    You can configure destination namespaces, hook phases, and direct or indirect migration.

    By default, a destination namespace has the same name as the source namespace. If you configure a different destination namespace, you must ensure that the namespaces are not duplicated on the source or the destination clusters because the UID and GID ranges are copied during migration.

    apiVersion: migration.openshift.io/v1alpha1
    kind: MigPlan
    metadata:
      labels:
        controller-tools.k8s.io: "1.0"
      name: <migplan>
      namespace: openshift-migration
    spec:
      closed: false (1)
      srcMigClusterRef:
        name: <source_cluster>
        namespace: openshift-migration
      destMigClusterRef:
        name: <destination_cluster>
        namespace: openshift-migration
      hooks: (2)
      - executionNamespace: <namespace> (3)
        phase: <migration_phase> (4)
        reference:
          name: <hook> (5)
          namespace: <hook_namespace> (6)
        serviceAccount: <service_account> (7)
      indirectImageMigration: false (8)
      indirectVolumeMigration: false (9)
      migStorageRef:
        name: <migstorage>
        namespace: openshift-migration
      namespaces:
      - <source_namespace_1> (10)
      - <source_namespace_2>
      - <source_namespace_3>:<destination_namespace> (11)
      refresh: false (12)

    (1) The migration has completed if true. You cannot create another MigMigration CR for this MigPlan CR.
    (2) Optional: You can specify up to four migration hooks. Each hook must run during a different migration phase.
    (3) Optional: Specify the namespace in which the hook will run.
    (4) Optional: Specify the migration phase during which a hook runs. One hook can be assigned to one phase. Valid values are PreBackup, PostBackup, PreRestore, and PostRestore.
    (5) Optional: Specify the name of the MigHook CR.
    (6) Optional: Specify the namespace of the MigHook CR.
    (7) Optional: Specify a service account with cluster-admin privileges.
    (8) Direct image migration is disabled if true. Images are copied from the source cluster to the replication repository and from the replication repository to the destination cluster.
    (9) Direct volume migration is disabled if true. PVs are copied from the source cluster to the replication repository and from the replication repository to the destination cluster.
    (10) Specify one or more source namespaces. If you specify only the source namespace, the destination namespace is the same.
    (11) Specify the destination namespace if it is different from the source namespace.
    (12) The MigPlan CR is validated if true.
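    Callout 12 indicates that setting refresh to true triggers validation of the MigPlan CR. A minimal sketch of doing this from the CLI with a merge patch:

    $ oc patch migplan <migplan> -n openshift-migration --type merge -p '{"spec":{"refresh":true}}'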

    MigStorage

    The MigStorage CR describes the object storage for the replication repository.

    Amazon Web Services (AWS), Microsoft Azure, Google Cloud Storage, Multi-Cloud Object Gateway, and generic S3-compatible cloud storage are supported.

    AWS and the snapshot copy method have additional parameters.

    apiVersion: migration.openshift.io/v1alpha1
    kind: MigStorage
    metadata:
      labels:
        controller-tools.k8s.io: "1.0"
      name: <migstorage>
      namespace: openshift-migration
    spec:
      backupStorageProvider: <backup_storage_provider> (1)
      volumeSnapshotProvider: <snapshot_storage_provider> (2)
      backupStorageConfig:
        awsBucketName: <bucket> (3)
        awsRegion: <region> (4)
        credsSecretRef:
          namespace: openshift-config
          name: <storage_secret> (5)
        awsKmsKeyId: <key_id> (6)
        awsPublicUrl: <public_url> (7)
        awsSignatureVersion: <signature_version> (8)
      volumeSnapshotConfig:
        awsRegion: <region> (9)
        credsSecretRef:
          namespace: openshift-config
          name: <storage_secret> (10)
      refresh: false (11)
    (1) Specify the storage provider.
    (2) Snapshot copy method only: Specify the storage provider.
    (3) AWS only: Specify the bucket name.
    (4) AWS only: Specify the bucket region, for example, us-east-1.
    (5) Specify the name of the Secret CR that you created for the storage.
    (6) AWS only: If you are using the AWS Key Management Service, specify the unique identifier of the key.
    (7) AWS only: If you granted public access to the AWS bucket, specify the bucket URL.
    (8) AWS only: Specify the AWS signature version for authenticating requests to the bucket, for example, 4.
    (9) Snapshot copy method only: Specify the geographical region of the clusters.
    (10) Snapshot copy method only: Specify the name of the Secret CR that you created for the storage.
    (11) Set to true to validate the MigStorage CR.
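    The credsSecretRef fields point to a Secret CR in the openshift-config namespace. The following is a minimal sketch of creating such a secret for AWS; the key names aws-access-key-id and aws-secret-access-key are assumptions, so verify them against your MTC version before relying on this:

    $ oc create secret generic <storage_secret> -n openshift-config \
        --from-literal=aws-access-key-id=<aws_access_key_id> \
        --from-literal=aws-secret-access-key=<aws_secret_access_key>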

    This section describes logs and debugging tools that you can use for troubleshooting.

    Viewing migration plan resources

    You can view migration plan resources to monitor a running migration or to troubleshoot a failed migration by using the MTC web console and the command line interface (CLI).

    Procedure

    1. In the MTC web console, click Migration Plans.

    2. Click the Migrations number next to a migration plan to view the Migrations page.

    3. Click a migration to view the Migration details.

    4. Expand Migration resources to view the migration resources and their status in a tree view.

      To troubleshoot a failed migration, start with a high-level resource that has failed and then work down the resource tree towards the lower-level resources.

    5. Click the Options menu next to a resource and select one of the following options:

      • Copy oc describe command copies the command to your clipboard.

        • Log in to the relevant cluster and then run the command.

          The conditions and events of the resource are displayed in YAML format.

      • Copy oc logs command copies the command to your clipboard.

        • Log in to the relevant cluster and then run the command.

          If the resource supports log filtering, a filtered log is displayed.

      • View JSON displays the resource data in JSON format in a web browser.

        The data is the same as the output for the oc get <resource> command.

    Viewing a migration plan log

    You can view an aggregated log for a migration plan. You use the MTC web console to copy a command to your clipboard and then run the command from the command line interface (CLI).

    The command displays the filtered logs of the following pods:

    • Migration Controller

    • Velero

    • Restic

    • Rsync

    • Stunnel

    • Registry

    Procedure

    1. In the MTC web console, click Migration Plans.

    2. Click the Migrations number next to a migration plan.

    3. Click the Copy icon to copy the oc logs command to your clipboard.

    4. Log in to the relevant cluster and enter the command on the CLI.

      The aggregated log for the migration plan is displayed.

    Using the migration log reader

    You can use the migration log reader to display a single filtered view of all the migration logs.

    Procedure

    1. Get the mig-log-reader pod:

      $ oc -n openshift-migration get pods | grep log
    2. Enter the following command to display a single migration log:

      $ oc -n openshift-migration logs -f <mig-log-reader-pod> -c color (1)
      (1) The -c plain option displays the log without colors.

    You can collect logs, metrics, and information about MTC custom resources by using the must-gather tool.

    The must-gather data must be attached to all customer cases.

    You can collect data for a one-hour or a 24-hour period and view the data with the Prometheus console.

    Prerequisites

    • You must be logged in to the OKD cluster as a user with the cluster-admin role.

    • You must have the OpenShift CLI (oc) installed.

    Procedure

    1. Navigate to the directory where you want to store the must-gather data.

    2. Run the oc adm must-gather command:

      • To gather data for the past hour:

        $ oc adm must-gather --image=registry.redhat.io/rhmtc/openshift-migration-must-gather-rhel8:v1.6

        The data is saved as /must-gather/must-gather.tar.gz. You can upload this file to a support case.

      • To gather data for the past 24 hours:

        $ oc adm must-gather --image=registry.redhat.io/rhmtc/openshift-migration-must-gather-rhel8:v1.6 -- /usr/bin/gather_metrics_dump

        This operation can take a long time. The data is saved as /must-gather/metrics/prom_data.tar.gz. You can view this file with the Prometheus console.

    To view the data with the Prometheus console:

    1. Create a local Prometheus instance:

      $ make prometheus-run

      The command outputs the Prometheus URL:

      Output

      Started Prometheus on http://localhost:9090
    2. Launch a web browser and navigate to the URL to view the data by using the Prometheus web console.

    3. After you have viewed the data, delete the Prometheus instance and data:

      $ make prometheus-cleanup

    Using the Velero CLI to debug Backup and Restore CRs

    You can debug the Backup and Restore custom resources (CRs) and partial migration failures with the Velero command line interface (CLI). The Velero CLI runs in the velero pod.

    Velero command syntax

    Velero CLI commands use the following syntax:

    $ oc exec $(oc get pods -n openshift-migration -o name | grep velero) -- ./velero <resource> <command> <resource_id>

    You can specify velero-<pod> -n openshift-migration in place of $(oc get pods -n openshift-migration -o name | grep velero).

    Help command

    The Velero help command lists all the Velero CLI commands:

    $ oc exec $(oc get pods -n openshift-migration -o name | grep velero) -- ./velero --help

    Describe command

    The Velero describe command provides a summary of warnings and errors associated with a Velero resource:

    $ oc exec $(oc get pods -n openshift-migration -o name | grep velero) -- ./velero <resource> describe <resource_id>

    Example

    $ oc exec $(oc get pods -n openshift-migration -o name | grep velero) -- ./velero backup describe 0e44ae00-5dc3-11eb-9ca8-df7e5254778b-2d8ql

    Logs command

    The Velero logs command provides the logs associated with a Velero resource:

    $ oc exec $(oc get pods -n openshift-migration -o name | grep velero) -- ./velero <resource> logs <resource_id>

    Example

    $ oc exec $(oc get pods -n openshift-migration -o name | grep velero) -- ./velero restore logs ccc7c2d0-6017-11eb-afab-85d0007f5a19-x4lbf

    Debugging a partial migration failure

    You can debug a partial migration failure warning message by using the Velero CLI to examine the Restore custom resource (CR) logs.

    A partial failure occurs when Velero encounters an issue that does not cause a migration to fail. For example, if a custom resource definition (CRD) is missing or if there is a discrepancy between CRD versions on the source and target clusters, the migration completes but the CR is not created on the target cluster.

    Velero logs the issue as a partial failure and then processes the rest of the objects in the Backup CR.

    Procedure

    1. Check the status of a MigMigration CR:

      $ oc get migmigration <migmigration> -o yaml

      Example output

      status:
        conditions:
        - category: Warn
          durable: true
          lastTransitionTime: "2021-01-26T20:48:40Z"
          message: 'Final Restore openshift-migration/ccc7c2d0-6017-11eb-afab-85d0007f5a19-x4lbf: partially failed on destination cluster'
          status: "True"
          type: VeleroFinalRestorePartiallyFailed
        - category: Advisory
          durable: true
          lastTransitionTime: "2021-01-26T20:48:42Z"
          message: The migration has completed with warnings, please look at `Warn` conditions.
          reason: Completed
          status: "True"
          type: SucceededWithWarnings
    2. Check the status of the Restore CR by using the Velero describe command:

      $ oc exec $(oc get pods -n openshift-migration -o name | grep velero) -n openshift-migration -- ./velero restore describe <restore>

      Example output

      Phase:  PartiallyFailed (run 'velero restore logs ccc7c2d0-6017-11eb-afab-85d0007f5a19-x4lbf' for more information)

      Errors:
        Velero:     <none>
        Cluster:    <none>
        Namespaces:
          migration-example:  error restoring example.com/migration-example/migration-example: the server could not find the requested resource
    3. Check the Restore CR logs by using the Velero logs command:

      $ oc exec $(oc get pods -n openshift-migration -o name | grep velero) -n openshift-migration -- ./velero restore logs <restore>

      Example output

      time="2021-01-26T20:48:37Z" level=info msg="Attempting to restore migration-example: migration-example" logSource="pkg/restore/restore.go:1107" restore=openshift-migration/ccc7c2d0-6017-11eb-afab-85d0007f5a19-x4lbf
      time="2021-01-26T20:48:37Z" level=info msg="error restoring migration-example: the server could not find the requested resource" logSource="pkg/restore/restore.go:1170" restore=openshift-migration/ccc7c2d0-6017-11eb-afab-85d0007f5a19-x4lbf

      The Restore CR log error message, the server could not find the requested resource, indicates the cause of the partially failed migration.
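    In this example, the failure was caused by a resource whose definition is missing on the target cluster. You can compare the relevant custom resource definition (CRD) on both clusters, for example by using separate kubeconfig contexts; the context names are placeholders:

    $ oc get crd <crd_name> --context=<source_cluster_context>
    $ oc get crd <crd_name> --context=<target_cluster_context>

    If the CRD is missing or the versions differ, install or update it on the target cluster and run the migration again.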

    Using MTC custom resources for troubleshooting

    You can check the following Migration Toolkit for Containers (MTC) custom resources (CRs) to troubleshoot a failed migration:

    • MigCluster

    • MigStorage

    • MigPlan

    • BackupStorageLocation

      The BackupStorageLocation CR contains a migrationcontroller label to identify the MTC instance that created the CR:

      labels:
        migrationcontroller: ebe13bee-c803-47d0-a9e9-83f380328b93
    • VolumeSnapshotLocation

      The VolumeSnapshotLocation CR contains a migrationcontroller label to identify the MTC instance that created the CR:

      labels:
        migrationcontroller: ebe13bee-c803-47d0-a9e9-83f380328b93
    • MigMigration

    • Backup

      MTC changes the reclaim policy of migrated persistent volumes (PVs) to Retain on the target cluster. The Backup CR contains an openshift.io/orig-reclaim-policy annotation that indicates the original reclaim policy. You can manually restore the reclaim policy of the migrated PVs.

    • Restore
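    As noted for the Backup CR above, you can manually restore the original reclaim policy of a migrated PV, which is recorded in the openshift.io/orig-reclaim-policy annotation. A minimal sketch using a merge patch; the PV name and policy value are placeholders:

    $ oc patch pv <pv> -p '{"spec":{"persistentVolumeReclaimPolicy":"Delete"}}'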

    Procedure

    1. List the MigMigration CRs in the openshift-migration namespace:

      $ oc get migmigration -n openshift-migration

      Example output

      NAME                                   AGE
      88435fe0-c9f8-11e9-85e6-5d593ce65e10   6m42s
    2. Inspect the MigMigration CR:

      $ oc describe migmigration 88435fe0-c9f8-11e9-85e6-5d593ce65e10 -n openshift-migration

      The output is similar to the following examples.

    MigMigration example output

    name:         88435fe0-c9f8-11e9-85e6-5d593ce65e10
    namespace:    openshift-migration
    labels:       <none>
    annotations:  touch: 3b48b543-b53e-4e44-9d34-33563f0f8147
    apiVersion:   migration.openshift.io/v1alpha1
    kind:         MigMigration
    metadata:
      creationTimestamp: 2019-08-29T01:01:29Z
      generation: 20
      resourceVersion: 88179
      selfLink: /apis/migration.openshift.io/v1alpha1/namespaces/openshift-migration/migmigrations/88435fe0-c9f8-11e9-85e6-5d593ce65e10
      uid: 8886de4c-c9f8-11e9-95ad-0205fe66cbb6
    spec:
      migPlanRef:
        name: socks-shop-mig-plan
        namespace: openshift-migration
      quiescePods: true
      stage: false
    status:
      conditions:
        category: Advisory
        durable: True
        lastTransitionTime: 2019-08-29T01:03:40Z
        message: The migration has completed successfully.
        reason: Completed
        status: True
        type: Succeeded
      phase: Completed
      startTimestamp: 2019-08-29T01:01:29Z
    events: <none>

    Velero backup CR #2 example output that describes the PV data

    apiVersion: velero.io/v1
    kind: Backup
    metadata:
      annotations:
        openshift.io/migrate-copy-phase: final
        openshift.io/migrate-quiesce-pods: "true"
        openshift.io/migration-registry: 172.30.105.179:5000
        openshift.io/migration-registry-dir: /socks-shop-mig-plan-registry-44dd3bd5-c9f8-11e9-95ad-0205fe66cbb6
        openshift.io/orig-reclaim-policy: delete
      creationTimestamp: "2019-08-29T01:03:15Z"
      generateName: 88435fe0-c9f8-11e9-85e6-5d593ce65e10-
      generation: 1
      labels:
        app.kubernetes.io/part-of: migration
        migmigration: 8886de4c-c9f8-11e9-95ad-0205fe66cbb6
        migration-stage-backup: 8886de4c-c9f8-11e9-95ad-0205fe66cbb6
        velero.io/storage-location: myrepo-vpzq9
      name: 88435fe0-c9f8-11e9-85e6-5d593ce65e10-59gb7
      namespace: openshift-migration
      resourceVersion: "87313"
      selfLink: /apis/velero.io/v1/namespaces/openshift-migration/backups/88435fe0-c9f8-11e9-85e6-5d593ce65e10-59gb7
      uid: c80dbbc0-c9f8-11e9-95ad-0205fe66cbb6
    spec:
      excludedNamespaces: []
      excludedResources: []
      hooks:
        resources: []
      includeClusterResources: null
      includedNamespaces:
      - sock-shop
      includedResources:
      - persistentvolumes
      - persistentvolumeclaims
      - namespaces
      - imagestreams
      - imagestreamtags
      - secrets
      - configmaps
      labelSelector:
        matchLabels:
          migration-included-stage-backup: 8886de4c-c9f8-11e9-95ad-0205fe66cbb6
      storageLocation: myrepo-vpzq9
      ttl: 720h0m0s
      volumeSnapshotLocations:
      - myrepo-wv6fx
    status:
      completionTimestamp: "2019-08-29T01:02:36Z"
      errors: 0
      expiration: "2019-09-28T01:02:35Z"
      phase: Completed
      startTimestamp: "2019-08-29T01:02:35Z"
      validationErrors: null
      version: 1
      volumeSnapshotsAttempted: 0
      volumeSnapshotsCompleted: 0
      warnings: 0

    Velero restore CR #2 example output that describes the Kubernetes resources

    apiVersion: velero.io/v1
    kind: Restore
    metadata:
      annotations:
        openshift.io/migrate-copy-phase: final
        openshift.io/migrate-quiesce-pods: "true"
        openshift.io/migration-registry: 172.30.90.187:5000
        openshift.io/migration-registry-dir: /socks-shop-mig-plan-registry-36f54ca7-c925-11e9-825a-06fa9fb68c88
      creationTimestamp: "2019-08-28T00:09:49Z"
      generateName: e13a1b60-c927-11e9-9555-d129df7f3b96-
      generation: 3
      labels:
        app.kubernetes.io/part-of: migration
        migmigration: e18252c9-c927-11e9-825a-06fa9fb68c88
        migration-final-restore: e18252c9-c927-11e9-825a-06fa9fb68c88
      name: e13a1b60-c927-11e9-9555-d129df7f3b96-gb8nx
      namespace: openshift-migration
      resourceVersion: "82329"
      selfLink: /apis/velero.io/v1/namespaces/openshift-migration/restores/e13a1b60-c927-11e9-9555-d129df7f3b96-gb8nx
      uid: 26983ec0-c928-11e9-825a-06fa9fb68c88
    spec:
      backupName: e13a1b60-c927-11e9-9555-d129df7f3b96-sz24f
      excludedNamespaces: null
      excludedResources:
      - nodes
      - events
      - events.events.k8s.io
      - backups.velero.io
      - restores.velero.io
      - resticrepositories.velero.io
      includedNamespaces: null
      includedResources: null
      namespaceMapping: null
      restorePVs: true
    status:
      errors: 0
      failureReason: ""
      phase: Completed
      validationErrors: null
      warnings: 15

    This section describes common issues that can occur during migration and how to resolve them.

    Direct volume migration does not complete

    If direct volume migration does not complete, the target cluster might not have the same node-selector annotations as the source cluster.

    Migration Toolkit for Containers (MTC) migrates namespaces with all annotations to preserve security context constraints and scheduling requirements. During direct volume migration, MTC creates Rsync transfer pods on the target cluster in the namespaces that were migrated from the source cluster. If a target cluster namespace does not have the same annotations as the source cluster namespace, the Rsync transfer pods cannot be scheduled. The Rsync pods remain in a Pending state.

    You can identify and fix this issue by performing the following procedure.

    Procedure

    1. Check the status of the MigMigration CR:

      $ oc describe migmigration <migmigration> -n openshift-migration

      The output includes the following status message:

      Example output

      Some or all transfer pods are not running for more than 10 mins on destination cluster
    2. On the source cluster, obtain the details of a migrated namespace:

      $ oc get namespace <namespace> -o yaml (1)
      (1) Specify the migrated namespace.
    3. On the target cluster, edit the migrated namespace:

      $ oc edit namespace <namespace>
    4. Add the missing openshift.io/node-selector annotations to the migrated namespace as in the following example:

      apiVersion: v1
      kind: Namespace
      metadata:
        annotations:
          openshift.io/node-selector: "region=east"
      ...
    5. Run the migration plan again.
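    You can verify that the Rsync transfer pods are no longer stuck in the Pending state in the migrated namespace on the target cluster, for example:

    $ oc get pods -n <namespace> --field-selector=status.phase=Pending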

    This section describes common error messages you might encounter with the Migration Toolkit for Containers (MTC) and how to resolve their underlying causes.

    CA certificate error displayed when accessing the MTC console for the first time

    If a CA certificate error message is displayed the first time you try to access the MTC console, the likely cause is the use of self-signed CA certificates in one of the clusters.

    To resolve this issue, navigate to the oauth-authorization-server URL displayed in the error message and accept the certificate. To resolve this issue permanently, add the certificate to the trust store of your web browser.

    If an Unauthorized message is displayed after you have accepted the certificate, navigate to the MTC console and refresh the web page.

    OAuth timeout error in the MTC console

    If a connection has timed out message is displayed in the MTC console after you have accepted a self-signed certificate, the cause is likely interrupted network access to the OAuth server or to the OKD console, or a proxy configuration that blocks access to the oauth-authorization-server URL.

    To determine the cause of the timeout:

    • Inspect the MTC console web page with a browser web inspector.

    • Check the Migration UI pod log for errors.

    Certificate signed by unknown authority error

    If you use a self-signed certificate to secure a cluster or a replication repository for the Migration Toolkit for Containers (MTC), certificate verification might fail with the following error message: Certificate signed by unknown authority.

    You can create a custom CA certificate bundle file and upload it in the MTC web console when you add a cluster or a replication repository.

    Procedure

    $ echo -n | openssl s_client -connect <host_FQDN>:<port> \ (1)
      | sed -ne '/-BEGIN CERTIFICATE-/,/-END CERTIFICATE-/p' > <ca_bundle.cert> (2)

    (1) Specify the host FQDN and port of the endpoint, for example, api.my-cluster.example.com:6443.
    (2) Specify the name of the CA bundle file.

    Backup storage location errors in the Velero pod log

    If a Velero Backup custom resource contains a reference to a backup storage location (BSL) that does not exist, the Velero pod log might display error messages about the missing BSL. You can check the Velero pod log by running the following command:

    $ oc logs <velero_pod> -n openshift-migration

    You can ignore these error messages. A missing BSL cannot cause a migration to fail.

    Pod volume backup timeout error in the Velero pod log

    If a migration fails because Restic times out, a timeout error is displayed in the Velero pod log.

    The default value of restic_timeout is one hour. You can increase this parameter for large migrations, keeping in mind that a higher value may delay the return of error messages.

    Procedure

    1. In the OKD web console, navigate to Operators → Installed Operators.

    2. Click Migration Toolkit for Containers Operator.

    3. In the MigrationController tab, click migration-controller.

    4. In the YAML tab, update the following parameter value:

      spec:
        restic_timeout: 1h (1)
      (1) Valid units are h (hours), m (minutes), and s (seconds), for example, 3h30m15s.
    5. Click Save.
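    Alternatively, the same change can be made from the CLI with a merge patch against the MigrationController CR; a minimal sketch with an example value:

    $ oc patch migrationcontroller migration-controller -n openshift-migration --type merge -p '{"spec":{"restic_timeout":"2h"}}'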

    Restic verification errors in the MigMigration custom resource

    If data verification fails when migrating a persistent volume with the file system data copy method, the following error is displayed in the MigMigration CR.

    Example output

    status:
      conditions:
      - category: Warn
        durable: true
        lastTransitionTime: 2020-04-16T20:35:16Z
        message: There were verify errors found in 1 Restic volume restores. See restore `<registry-example-migration-rvwcm>`
          for details (1)
        status: "True"
        type: ResticVerifyErrors (2)

    (1) The error message identifies the Restore CR name.
    (2) ResticVerifyErrors is a general error type that includes verification errors.

    A data verification error does not cause the migration process to fail.

    You can check the Restore CR to identify the source of the data verification error.

    Procedure

    1. Log in to the target cluster.

    2. View the Restore CR:

      $ oc describe restore <registry-example-migration-rvwcm> -n openshift-migration

      The output identifies the persistent volume with PodVolumeRestore errors.

      Example output

      status:
        phase: Completed
        podVolumeRestoreErrors:
        - kind: PodVolumeRestore
          name: <registry-example-migration-rvwcm-98t49>
          namespace: openshift-migration
        podVolumeRestoreResticErrors:
        - kind: PodVolumeRestore
          name: <registry-example-migration-rvwcm-98t49>
          namespace: openshift-migration
    3. View the PodVolumeRestore CR:

      $ oc describe podvolumerestore <registry-example-migration-rvwcm-98t49> -n openshift-migration

      The output identifies the Restic pod that logged the errors.

      Example output

      completionTimestamp: 2020-05-01T20:49:12Z
      errors: 1
      resticErrors: 1
      ...
      resticPod: <restic-nr2v5>
    4. View the Restic pod log to locate the errors:

      $ oc logs -f <restic-nr2v5>

    Restic permission error when migrating from NFS storage with root_squash enabled

    If you are migrating data from NFS storage and root_squash is enabled, Restic maps to nfsnobody and does not have permission to perform the migration. The following error is displayed in the Restic pod log.

    Example output

    backup=openshift-migration/<backup_id> controller=pod-volume-backup error="fork/exec /usr/bin/restic: permission denied" error.file="/go/src/github.com/vmware-tanzu/velero/pkg/controller/pod_volume_backup_controller.go:280" error.function="github.com/vmware-tanzu/velero/pkg/controller.(*podVolumeBackupController).processBackup" logSource="pkg/controller/pod_volume_backup_controller.go:280" name=<backup_id> namespace=openshift-migration

    You can resolve this issue by creating a supplemental group for Restic and adding the group ID to the MigrationController CR manifest.

    Procedure

    1. Create a supplemental group for Restic on the NFS storage.

    2. Set the setgid bit on the NFS directories so that group ownership is inherited.

    3. Add the restic_supplemental_groups parameter to the MigrationController CR manifest on the source and target clusters:

      spec:
        restic_supplemental_groups: <group_id> (1)
      (1) Specify the supplemental group ID.
    4. Wait for the Restic pods to restart so that the changes are applied.
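    Step 2 sets the setgid bit on the NFS export so that new files inherit the supplemental group. A minimal sketch of what that looks like on the NFS server; the export path is a placeholder:

    # chgrp <group_id> /<nfs_export_path>
    # chmod g+s /<nfs_export_path>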

    You can roll back a migration by using the MTC web console or the CLI.

    You can also roll back a migration manually.

    Rolling back a migration by using the MTC web console

    You can roll back a migration by using the Migration Toolkit for Containers (MTC) web console.

    The following resources remain in the migrated namespaces for debugging after a failed direct volume migration (DVM):

    • Config maps (source and destination clusters)

    • Secret CRs (source and destination clusters)

    • Rsync CRs (source cluster)

    These resources do not affect rollback. You can delete them manually.

    If you later run the same migration plan successfully, the resources from the failed migration are deleted automatically.

    If your application was stopped during a failed migration, you must roll back the migration to prevent data corruption in the persistent volume.

    Rollback is not required if the application was not stopped during migration because the original application is still running on the source cluster.

    Procedure

    1. In the MTC web console, click Migration plans.

    2. Click the Options menu beside a migration plan and select Rollback under Migration.

    3. Click Rollback and wait for rollback to complete.

      In the migration plan details, Rollback succeeded is displayed.

    4. Verify that rollback was successful in the OKD web console of the source cluster:

      1. Click Home → Projects.

      2. Click the migrated project to view its status.

      3. In the Routes section, click Location to verify that the application is functioning, if applicable.

      4. Click Workloads → Pods to verify that the pods are running in the migrated namespace.

      5. Click Storage → Persistent Volumes to verify that the migrated persistent volume is correctly provisioned.

    Rolling back a migration from the command line interface

    You can roll back a migration by creating a MigMigration custom resource (CR) from the command line interface.

    The following resources remain in the migrated namespaces for debugging after a failed direct volume migration (DVM):

    • Config maps (source and destination clusters)

    • Secret CRs (source and destination clusters)

    • Rsync CRs (source cluster)

    These resources do not affect rollback. You can delete them manually.

    If you later run the same migration plan successfully, the resources from the failed migration are deleted automatically.

    If your application was stopped during a failed migration, you must roll back the migration to prevent data corruption in the persistent volume.

    Rollback is not required if the application was not stopped during migration because the original application is still running on the source cluster.

    Procedure

    1. Create a MigMigration CR based on the following example:

      $ cat << EOF | oc apply -f -
      apiVersion: migration.openshift.io/v1alpha1
      kind: MigMigration
      metadata:
        labels:
          controller-tools.k8s.io: "1.0"
        name: <migmigration>
        namespace: openshift-migration
      spec:
        ...
        rollback: true
        ...
        migPlanRef:
          name: <migplan> (1)
          namespace: openshift-migration
      EOF
      (1) Specify the name of the associated MigPlan CR.
    2. In the MTC web console, verify that the migrated project resources have been removed from the target cluster.

    3. Verify that the migrated project resources are present in the source cluster and that the application is running.

    Rolling back a migration manually

    You can roll back a failed migration manually by deleting the stage pods and unquiescing the application.

    If you run the same migration plan successfully, the resources from the failed migration are deleted automatically.

    The following resources remain in the migrated namespaces after a failed direct volume migration (DVM):

    • Config maps (source and destination clusters)

    • Secret CRs (source and destination clusters)

    • Rsync CRs (source cluster)

    These resources do not affect rollback. You can delete them manually.

    Procedure

    1. Delete the stage pods on all clusters:

      $ oc delete $(oc get pods -o name -l migration.openshift.io/is-stage-pod -n <namespace>) -n <namespace> (1)
      (1) Namespaces specified in the MigPlan CR.
    2. Unquiesce the application on the source cluster by scaling the replicas to their premigration number:

      $ oc scale deployment <deployment> --replicas=<premigration_replicas>

      The migration.openshift.io/preQuiesceReplicas annotation in the Deployment CR displays the premigration number of replicas:

      apiVersion: extensions/v1beta1
      kind: Deployment
      metadata:
        annotations:
          deployment.kubernetes.io/revision: "1"
          migration.openshift.io/preQuiesceReplicas: "1"
    3. Verify that the application pods are running on the source cluster:

      $ oc get pod -n <namespace>
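    To read the premigration replica count directly from the annotation, you can use a jsonpath query, for example:

    $ oc get deployment <deployment> -n <namespace> -o jsonpath='{.metadata.annotations.migration\.openshift\.io/preQuiesceReplicas}'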

    You can uninstall the Migration Toolkit for Containers (MTC) and delete its resources to clean up the cluster.

    Deleting the velero CRDs removes Velero from the cluster.

    Prerequisites

    • You must be logged in as a user with cluster-admin privileges.

    Procedure

    1. Delete the MigrationController custom resource (CR) on all clusters:

      $ oc delete migrationcontroller <migration_controller>
    2. Uninstall the Migration Toolkit for Containers Operator on OKD 4 by using the Operator Lifecycle Manager.

    3. Delete cluster-scoped resources on all clusters by running the following commands:

      • migration custom resource definitions (CRDs):

        $ oc delete $(oc get crds -o name | grep 'migration.openshift.io')
      • velero CRDs:

        $ oc delete $(oc get crds -o name | grep 'velero')
      • migration cluster roles:

        $ oc delete $(oc get clusterroles -o name | grep 'migration.openshift.io')
      • migration-operator cluster role:

        $ oc delete clusterrole migration-operator
      • velero cluster roles:

        $ oc delete $(oc get clusterroles -o name | grep 'velero')
      • migration cluster role bindings:

        $ oc delete $(oc get clusterrolebindings -o name | grep 'migration.openshift.io')
      • velero cluster role bindings:

        $ oc delete $(oc get clusterrolebindings -o name | grep 'velero')

    Additional resources for uninstalling MTC