Google Cloud Storage - move a file from one folder to another using Python


Here's a function I use when moving blobs between directories within the same bucket or to a different bucket.

from google.cloud import storage
import os

os.environ["GOOGLE_APPLICATION_CREDENTIALS"] = "path_to_your_creds.json"

def mv_blob(bucket_name, blob_name, new_bucket_name, new_blob_name):
    """
    Function for moving files between directories or buckets. It will use
    GCP's copy function, then delete the blob from the old location.

    inputs
    ------
    bucket_name: name of bucket
    blob_name: str, name of file
        ex. 'data/some_location/file_name'
    new_bucket_name: name of bucket (can be same as original
        if we're just moving around directories)
    new_blob_name: str, name of file in new directory in target bucket
        ex. 'data/destination/file_name'
    """
    storage_client = storage.Client()
    source_bucket = storage_client.get_bucket(bucket_name)
    source_blob = source_bucket.blob(blob_name)
    destination_bucket = storage_client.get_bucket(new_bucket_name)

    # copy to new destination
    new_blob = source_bucket.copy_blob(source_blob, destination_bucket, new_blob_name)
    # delete in old destination
    source_blob.delete()

    print(f'File moved from {blob_name} to {new_blob_name}')
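For instance, moving a file between directories in the same bucket looks like this (the bucket and object names here are hypothetical):

mv_blob('my-bucket', 'data/some_location/file_name',
        'my-bucket', 'data/destination/file_name')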

Using the google-api-python-client, there is an example on the storage.objects.copy page. After you copy, you can delete the source with storage.objects.delete.

import json

destination_object_resource = {}
req = client.objects().copy(
    sourceBucket=bucket1,
    sourceObject=old_object,
    destinationBucket=bucket2,
    destinationObject=new_object,
    body=destination_object_resource)
resp = req.execute()
print(json.dumps(resp, indent=2))

client.objects().delete(
    bucket=bucket1,
    object=old_object).execute()
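The client object above is not built in the snippet. A minimal sketch of its construction, assuming the discovery-based google-api-python-client and Application Default Credentials in the environment:

from googleapiclient import discovery

# Build a Cloud Storage JSON API client; credentials are resolved
# from the environment (Application Default Credentials).
client = discovery.build('storage', 'v1')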
The same copy_blob pattern also works for copying every object from one bucket to another:

def GCP_BUCKET_A_TO_B():
    source_bucket = storage_client.get_bucket("Bucket_A_Name")
    destination_bucket = storage_client.get_bucket("Bucket_B_Name")
    filenames = [blob.name for blob in source_bucket.list_blobs(prefix="")]
    for name in filenames:
        source_blob = source_bucket.blob(name)
        # this copies only; call source_blob.delete() after the copy
        # to turn it into a move
        new_blob = source_bucket.copy_blob(source_blob, destination_bucket, name)

Example:

import subprocess

def move(source_uri: str, destination_uri: str) -> None:
    """
    Move file from source_uri to destination_uri.

    :param source_uri: gs://-like URI of the source file/directory
    :param destination_uri: gs://-like URI of the destination file/directory
    :return: None
    """
    cmd = f"gsutil -m mv {source_uri} {destination_uri}"
    # a string command needs shell=True; check=True raises if gsutil fails
    subprocess.run(cmd, shell=True, check=True)
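A hypothetical call, assuming the gsutil CLI is installed and authenticated:

move("gs://bucket-a/data/file.csv", "gs://bucket-b/data/file.csv")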

Suggestion : 2

Manage files in your Google Cloud Storage bucket using the google-cloud-storage Python library. Let's start coding, shall we? Make sure the google-cloud-storage library is installed on your machine with pip3 install google-cloud-storage. If you're a returning reader, you may recall we actually touched on google-cloud-storage in a previous tutorial, when we walked through managing files in GCP with Python. If you feel comfortable setting up a GCP Storage bucket on your own, this first bit might get a little repetitive.

I'm going to set up our project with a config.py file containing relevant information we'll need to work with:

""
"Google Cloud Storage Configuration."
""
from os
import environ

# Google Cloud Storage
bucketName = environ.get('GCP_BUCKET_NAME')
bucketFolder = environ.get('GCP_BUCKET_FOLDER_NAME')

# Data
localFolder = environ.get('LOCAL_FOLDER')

With that done, we can start our script by importing these values:

""
"Programatically interact with a Google Cloud Storage bucket."
""
from google.cloud
import storage
from config
import bucketName, localFolder, bucketFolder

   ...

Before we do anything, we need to create an object representing our bucket. I'm creating a global variable named bucket. This is created by calling the get_bucket() method on our storage client and passing the name of our bucket:

""
"Programatically interact with a Google Cloud Storage bucket."
""
from google.cloud
import storage
from config
import bucketName, localFolder, bucketFolder

storage_client = storage.Client()
bucket = storage_client.get_bucket(bucketName)

   ...

We then loop through each file in our array of files. We set the desired destination of each file using bucket.blob(), which accepts the desired file path where our file will live once uploaded to GCP. We then upload the file with blob.upload_from_filename(localFile):
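A minimal sketch of that loop, assuming the files live in localFolder from config and that folder names end with a slash (the helper name upload_files is an assumption):

from os import listdir
from os.path import isfile, join

def upload_files(bucketName):
    """Upload files to GCP bucket."""
    # collect regular files in the local folder
    files = [f for f in listdir(localFolder) if isfile(join(localFolder, f))]
    for file in files:
        localFile = localFolder + file
        # destination path inside the bucket
        blob = bucket.blob(bucketFolder + file)
        blob.upload_from_filename(localFile)
    return f'Uploaded {files} to "{bucketName}" bucket.'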

Uploaded ['sample_csv.csv', 'sample_text.txt', 'peas.jpg', 'sample_image.jpg'] to "hackers-data" bucket.

Knowing which files exist in our bucket is obviously important:

def list_files(bucketName):
    """List all files in GCP bucket."""
    files = bucket.list_blobs(prefix=bucketFolder)
    fileList = [file.name for file in files if '.' in file.name]
    return fileList
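Calling it returns the object names under bucketFolder (the output shown here is hypothetical):

print(list_files(bucketName))
# ['storage-tutorial/sample_csv.csv', 'storage-tutorial/sample_text.txt', ...]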

Suggestion : 3

GCSToGCSOperator allows you to copy one or more files within GCS. The files may be copied between two different buckets or within one bucket. The copying always takes place without taking into account the initial state of the destination bucket. The operator only deletes objects in the source bucket when the file-move option is active; when copying files between two different buckets, it never deletes data in the destination bucket. These operators do not control the copying process locally, but use Google resources, which allows them to perform this task faster and more economically. The economic effects are especially prominent when Airflow is not hosted in Google Cloud, because these operators reduce egress traffic. Below are examples of using the GCSToGCSOperator to copy a single file, to copy multiple files with a wildcard, to copy multiple files, to move a single file, and to move multiple files (the last case is sketched after the operator examples).

copy_single_file = GCSToGCSOperator(
    task_id="copy_single_gcs_file",
    source_bucket=BUCKET_NAME_SRC,
    source_object=OBJECT_1,
    # If not supplied, the source_bucket value will be used
    destination_bucket=BUCKET_NAME_DST,
    # If not supplied, the source_object value will be used
    destination_object="backup_" + OBJECT_1,
)
copy_files_with_wildcard = GCSToGCSOperator(
    task_id="copy_files_with_wildcard",
    source_bucket=BUCKET_NAME_SRC,
    source_object="data/*.txt",
    destination_bucket=BUCKET_NAME_DST,
    destination_object="backup/",
)
copy_files_with_delimiter = GCSToGCSOperator(
    task_id="copy_files_with_delimiter",
    source_bucket=BUCKET_NAME_SRC,
    source_object="data/",
    destination_bucket=BUCKET_NAME_DST,
    destination_object="backup/",
    delimiter='.txt',
)
copy_files_without_wildcard = GCSToGCSOperator(
    task_id="copy_files_without_wildcard",
    source_bucket=BUCKET_NAME_SRC,
    source_object="subdir/",
    destination_bucket=BUCKET_NAME_DST,
    destination_object="backup/",
)
copy_files_with_list = GCSToGCSOperator(
    task_id="copy_files_with_list",
    source_bucket=BUCKET_NAME_SRC,
    # Instead of files, each element could be a wildcard expression
    source_objects=[OBJECT_1, OBJECT_2],
    destination_bucket=BUCKET_NAME_DST,
    destination_object="backup/",
)
move_single_file = GCSToGCSOperator(
    task_id="move_single_file",
    source_bucket=BUCKET_NAME_SRC,
    source_object=OBJECT_1,
    destination_bucket=BUCKET_NAME_DST,
    destination_object="backup_" + OBJECT_1,
    move_object=True,
)
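The move-multiple-files case mentioned above pairs move_object=True with a wildcard source. A minimal sketch, reusing the same buckets (the task_id is an assumption):

move_files_with_wildcard = GCSToGCSOperator(
    task_id="move_files_with_wildcard",
    source_bucket=BUCKET_NAME_SRC,
    source_object="data/*.txt",
    destination_bucket=BUCKET_NAME_DST,
    destination_object="backup/",
    # each source object is deleted after it has been copied
    move_object=True,
)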

Suggestion : 4

Are there any API functions that allow us to move files in Google Cloud Storage from one bucket to another? The scenario is that we want Python to move already-read files from bucket A to bucket B. We knew that gsutil could do that, but weren't sure whether Python could support it. The mv_blob function shown at the top of this article answers this directly.
