Operational Defect Database


VMware | 93015

Bosh task fails with "Error: Failed to upload blob, code 1, output: 'Error running app - Putting dav blob xxx-xxx-xxx-xxx : Wrong response code: 500; body: <html> <head><title>500 Internal Server Error</title></head>"

Last update date:

4/4/2024

Affected products:

Tanzu Kubernetes Grid Integrated Edition

Affected releases:

No affected releases provided.

Fixed releases:

No fixed releases provided.

Description:

Symptoms

When updating a Kubernetes cluster's configuration, most commonly when increasing the worker node count, the BOSH task fails with the following error:

$ tkgi update-cluster <cluster-name> --num-nodes 4

Update summary for cluster <cluster-name>:
Worker Number: 4
Cluster Tags: [{cluster_name <cluster-name>}]
Are you sure you want to continue? (y/n): y
Use '<cluster-name>' to monitor the state of your cluster

$ bosh task
Using environment '10.32.36.10' as client 'ops_manager'

Task 1417873

Task 1417873 | 00:36:22 | Deprecation: Global 'properties' are deprecated. Please define 'properties' at the job level.
Task 1417873 | 00:36:24 | Preparing deployment: Preparing deployment
Task 1417873 | 00:36:25 | Warning: DNS address not available for the link provider instance: pivotal-container-service/fb7126ce-fd3e-4f9a-a79c-24bbf1342a8d
Task 1417873 | 00:36:25 | Warning: DNS address not available for the link provider instance: pivotal-container-service/fb7126ce-fd3e-4f9a-a79c-24bbf1342a8d
Task 1417873 | 00:36:25 | Warning: DNS address not available for the link provider instance: pivotal-container-service/fb7126ce-fd3e-4f9a-a79c-24bbf1342a8d
Task 1417873 | 00:36:43 | Preparing deployment: Preparing deployment (00:00:19)
Task 1417873 | 00:36:43 | Preparing deployment: Rendering templates (00:00:11)
Task 1417873 | 00:36:54 | Preparing package compilation: Finding packages to compile (00:00:00)
Task 1417873 | 00:36:55 | Creating missing vms: worker/b233cdf7-fc20-41f5-b495-37d37060158a (1)
Task 1417873 | 00:36:55 | Creating missing vms: worker/2b4cf530-1cf9-4214-a012-790e4e85c9f0 (3)
Task 1417873 | 00:36:55 | Creating missing vms: worker/5cd0b491-06d7-4df4-8da3-c158a7c9f6b6 (2)
Task 1417873 | 00:39:16 | Creating missing vms: worker/b233cdf7-fc20-41f5-b495-37d37060158a (1) (00:02:21)
Task 1417873 | 00:39:29 | Creating missing vms: worker/2b4cf530-1cf9-4214-a012-790e4e85c9f0 (3) (00:02:34)
Task 1417873 | 00:39:30 | Creating missing vms: worker/5cd0b491-06d7-4df4-8da3-c158a7c9f6b6 (2) (00:02:35)
Task 1417873 | 00:39:30 | Error: Failed to upload blob, code 1, output: 'Error running app - Putting dav blob b27061fc-5cad-4639-b469-ecd180b90036: Wrong response code: 500; body: <html> <head><title>500 Internal Server Error</title></head> <body> <center><h1>500 Internal Server Error</h1></center> <hr><center>nginx</center> </body> </html> ', error: ''

Task 1417873 Started  Tue Jun 20 00:36:22 UTC 2023
Task 1417873 Finished Tue Jun 20 00:39:30 UTC 2023
Task 1417873 Duration 00:03:08
Task 1417873 error

Capturing task '1417873' output:
  Expected task '1417873' to succeed but state is 'error'

Exit code 1

Cause

This error may occur when multiple tasks are timing out or failing, and too many BOSH tasks are queued on the Director:

$ bosh tasks
Using environment 'x.x.x.x' as client 'ops_manager'

ID      State   Started At  Finished At  User                            Deployment             Description        Result
262506  queued  -           -            pivotal-container-service-axxx  service-instance_xxxx  retrieve vm-stats  -
262505  queued  -           -            pivotal-container-service-axxx  service-instance_xxxx  retrieve vm-stats  -
262503  queued  -           -            ops_manager                     service-instance_xxxx  ssh: setup:{"ids"=>["ff4d7cce-2f2d-468e-ba90-246a33a1b8bb"], "indexes"=>["ff4d7cce-2f2d-468e-ba90-246a33a1b8bb"], "job"=>"worker"}  -
262502  queued  -           -            ops_manager                     service-instance_xxxx  ssh: setup:{"ids"=>["fe2c2f36-8cee-40af-9b6c-84c650776405"], "indexes"=>["fe2c2f36-8cee-40af-9b6c-84c650776405"], "job"=>"worker"}  -
262501  queued  -           -            ops_manager                     service-instance_xxxx  ssh: setup:{"ids"=>["fa9b3b8d-9f02-41cb-a945-c7536d4d2e3d"], "indexes"=>["fa9b3b8d-9f02-41cb-a945-c7536d4d2e3d"], "job"=>"worker"}  -
262500  queued  -           -            ops_manager                     service-instance_xxxx  ssh: setup:{"ids"=>["f9ca89c2-0396-41ea-8986-a303ea41e2e3"], "indexes"=>["f9ca89c2-0396-41ea-8986-a303ea41e2e3"], "job"=>"worker"}  -
...

472 tasks

Succeeded

Inspecting the Director's task history shows that the counts for the scheduled_events_cleanup, snapshot_deployment, ssh, and vms task types each exceed 2000:

type: cck_scan_and_fix count: 140
type: delete_artifacts count: 16
type: delete_deployment count: 8
type: fetch_logs count: 12
type: run_errand count: 700
type: scheduled_dns_blobs_cleanup count: 612
type: scheduled_events_cleanup count: 2151
type: scheduled_orphaned_disk_cleanup count: 277
type: scheduled_task_cleanup count: 113
type: snapshot_deployment count: 2021
type: snapshot_deployments count: 707
type: snapshot_self count: 707
type: ssh count: 2772
type: update_deployment count: 289
type: update_release count: 634
type: update_stemcell count: 1
type: vms count: 3711
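The size of the backlog can be gauged by filtering the `bosh tasks` output for tasks in the `queued` state. A minimal sketch, using a captured sample in place of live output (the sample rows below are illustrative; in a live environment you would pipe `bosh tasks --no-color` into the same filter):

```shell
# Sketch: count tasks in the 'queued' state from 'bosh tasks' output.
# The state is the second whitespace-separated column of each row.
# Sample data stands in for live 'bosh tasks --no-color' output.
sample_output='262506 queued - - pivotal-container-service-axxx service-instance_xxxx retrieve-vm-stats -
262505 queued - - pivotal-container-service-axxx service-instance_xxxx retrieve-vm-stats -
262503 queued - - ops_manager service-instance_xxxx ssh -
262400 processing - - ops_manager service-instance_xxxx update_deployment -'

# Count rows whose State column is 'queued'.
queued_count=$(printf '%s\n' "$sample_output" | awk '$2 == "queued" { n++ } END { print n+0 }')
echo "queued tasks: $queued_count"
```

Against the sample above this reports 3 queued tasks; a number in the hundreds or thousands on a live Director points to the backlog condition described here.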

Resolution

WARNING: Make sure you are absolutely certain that the queued tasks are not affecting any ongoing deployments. If a deployment is currently running, DO NOT CONTINUE; contact Tanzu Support for assistance. The steps outlined in this article cancel all queued tasks and, if a deployment is running, may leave it in an inconsistent and potentially broken state. Again, do not continue if a deployment is in progress.

If using Ops Manager 2.7 or later, cancel all queued BOSH tasks with the following command:

bosh cancel-tasks -s=queued

Once the BOSH tasks are cancelled, retry the cluster scaling operation:

tkgi update-cluster <cluster-name> --num-nodes 4

NOTE: If the bosh cancel-tasks -s=queued command is not available in your version, follow the steps in the following KB article to cancel the queued tasks: https://community.pivotal.io/s/article/How-to-Cancel-All-Queued-BOSH-Tasks-Using-director-ctl?language=en_US
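The precondition in the warning above can be checked mechanically before canceling anything: verify that no task is in the `processing` state. A minimal sketch (the `safe_to_cancel` helper and the sample rows are illustrative, not part of the BOSH CLI):

```shell
# Sketch: guard against canceling queued tasks while a deployment is running.
# 'safe_to_cancel' reads 'bosh tasks'-style output on stdin and succeeds only
# when no row has 'processing' in the State (second) column.
safe_to_cancel() {
  ! awk '$2 == "processing"' | grep -q .
}

# Live usage would be:
#   bosh tasks --no-color | safe_to_cancel && bosh cancel-tasks -s=queued
# Here a captured sample stands in for live output.
sample='262506 queued - - ops_manager service-instance_xxxx ssh -
262505 processing - - ops_manager service-instance_xxxx update_deployment -'

if printf '%s\n' "$sample" | safe_to_cancel; then
  echo "no running tasks: safe to cancel queued tasks"
else
  echo "deployment in progress: do NOT cancel queued tasks"
fi
```

Because the sample contains a `processing` row, the sketch takes the "do NOT cancel" branch; the cancel command should run only when the check passes.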

Related Information

How to clean up stale BOSH tasks history from BOSH Director console
https://community.pivotal.io/s/article/How-to-clean-up-stale-BOSH-tasks-history-from-console?language=en_US

How to cancel all queued BOSH tasks using director_ctl in Operations Manager
https://community.pivotal.io/s/article/How-to-Cancel-All-Queued-BOSH-Tasks-Using-director-ctl?language=en_US
