gsutil: exploring cloud storage

In [2]:
#some help
!gsutil --help

Usage: gsutil [-D] [-DD] [-h header]... [-m] [-o] [-q] [command [opts...] args...]
Available commands:
  acl             Get, set, or change bucket and/or object ACLs
  cat             Concatenate object content to stdout
  compose         Concatenate a sequence of objects into a new composite object.
  config          Obtain credentials and create configuration file
  cors            Get or set a CORS JSON document for one or more buckets
  cp              Copy files and objects
  defacl          Get, set, or change default ACL on buckets
  defstorageclass Get or set the default storage class on buckets
  du              Display object size usage
  hash            Calculate file hashes
  help            Get help about commands and topics
  iam             Get, set, or change bucket and/or object IAM permissions.
  label           Get, set, or change the label configuration of a bucket.
  lifecycle       Get or set lifecycle configuration for a bucket
  logging         Configure or retrieve logging on buckets
  ls              List providers, buckets, or objects
  mb              Make buckets
  mv              Move/rename objects and/or subdirectories
  notification    Configure object change notification
  perfdiag        Run performance diagnostic
  rb              Remove buckets
  rewrite         Rewrite objects
  rm              Remove objects
  rsync           Synchronize content of two buckets/directories
  setmeta         Set metadata on already uploaded objects
  signurl         Create a signed url
  stat            Display object status
  test            Run gsutil tests
  update          Update to the latest gsutil release
  version         Print version info about gsutil
  versioning      Enable or suspend versioning for one or more buckets
  web             Set a main page and/or error page for one or more buckets

Additional help topics:
  acls            Working With Access Control Lists
  anon            Accessing Public Data Without Credentials
  apis            Cloud Storage APIs
  crc32c          CRC32C and Installing crcmod
  creds           Credential Types Supporting Various Use Cases
  csek            Supplying Your Own Encryption Keys
  dev             Contributing Code to gsutil
  encoding        Filename encoding and interoperability problems
  metadata        Working With Object Metadata
  naming          Object and Bucket Naming
  options         Top-Level Command-Line Options
  prod            Scripting Production Transfers
  projects        Working With Projects
  retries         Retry Handling Strategy
  security        Security and Privacy Considerations
  subdirs         How Subdirectories Work
  support         Google Cloud Storage Support
  throttling      Throttling gsutil
  versions        Object Versioning and Concurrency Control
  wildcards       Wildcard Names

Use gsutil help <command or topic> for detailed help.

In [30]:
# list my buckets
# (the project ID is already set in the gcloud config;
#  alternatively, pass it with the -p projectId parameter)
!gsutil ls


In [ ]:
#more details
!gsutil ls -L

Let's create a new bucket

In [35]:
#some help
!gsutil mb --help

  mb - Make buckets


  gsutil mb [-c class] [-l location] [-p proj_id] url...

  The mb command creates a new bucket. Google Cloud Storage has a single
  namespace, so you are not allowed to create a bucket with a name already
  in use by another user. You can, however, carve out parts of the bucket name
  space corresponding to your company's domain name (see "gsutil help naming").

  If you don't specify a project ID using the -p option, the bucket is created
  using the default project ID specified in your gsutil configuration file
  (see "gsutil help config"). For more details about projects see "gsutil help
  projects".

  The -c and -l options specify the storage class and location, respectively,
  for the bucket. Once a bucket is created in a given location and with a
  given storage class, it cannot be moved to a different location, and the
  storage class cannot be changed. Instead, you would need to create a new
  bucket and move the data over and then delete the original bucket.

  You can specify one of the storage classes for a bucket
  with the -c option.


    gsutil mb -c nearline gs://some-bucket

  See online documentation for pricing and SLA details.

  If you don't specify a -c option, the bucket is created with the
  default storage class Standard Storage, which is equivalent to Multi-Regional
  Storage or Regional Storage, depending on whether the bucket was created in
  a multi-regional location or regional location, respectively.

  You can specify one of the available locations for a bucket
  with the -l option.


    gsutil mb -l asia gs://some-bucket

    gsutil mb -c regional -l us-east1 gs://some-bucket

  If you don't specify a -l option, the bucket is created in the default
  location (US).

  -c class          Specifies the default storage class. Default is "Standard".

  -l location       Can be any multi-regional or regional location. See
                    for a discussion of this distinction. Default is US.
                    Locations are case insensitive.

  -p proj_id        Specifies the project ID under which to create the bucket.

  -s class          Same as -c.

In [36]:
# create a new bucket with storage class regional in region europe-west2
!gsutil mb -c regional -l europe-west2 gs://a-brand-new-bucket-toto/

Creating gs://a-brand-new-bucket-toto/...
ServiceException: 409 Bucket a-brand-new-bucket-toto already exists.
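The 409 above just means the bucket name is already taken (here, by an earlier run of the same cell). A minimal sketch, in Python, of how the same mb call could be assembled from a notebook and how that particular message could be recognised. The helper names build_mb_command and bucket_already_exists are hypothetical, and the string check is only a heuristic based on the output shown above, not a documented gsutil interface.

```python
def build_mb_command(bucket_url, storage_class=None, location=None, project=None):
    """Return the argv list for `gsutil mb`, using only the -c, -l and -p flags."""
    cmd = ["gsutil", "mb"]
    if storage_class:
        cmd += ["-c", storage_class]
    if location:
        cmd += ["-l", location]
    if project:
        cmd += ["-p", project]
    cmd.append(bucket_url)
    return cmd


def bucket_already_exists(output):
    """Heuristic (assumption): match the 409 text seen in the cell output above."""
    return "409" in output and "already exists" in output


cmd = build_mb_command("gs://a-brand-new-bucket-toto/", "regional", "europe-west2")
print(" ".join(cmd))
```

The resulting list can be handed to subprocess.run(cmd, capture_output=True, text=True), and the captured stderr passed to bucket_already_exists to decide whether the failure is harmless.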

In [37]:
!gsutil ls


 everything is better with labels :-)

In [8]:
#some help
!gsutil label --help

  label - Get, set, or change the label configuration of a bucket.

  gsutil label set label-json-file url...
  gsutil label get url
  gsutil label ch <label_modifier>... url...

  where each <label_modifier> is one of the following forms:

    -l <key>:<value>
    -d <key>

  Gets, sets, or changes the label configuration (also called the tagging
  configuration by other storage providers) of one or more buckets. An example
  label JSON document looks like the following:

    {
      "your_label_key": "your_label_value",
      "your_other_label_key": "your_other_label_value"
    }

  The label command has three sub-commands:

  The "label get" command gets the labels
  applied to a bucket, which you can save and edit for use with the "label set"
  command.

  The "label set" command allows you to set the labels on one or more
  buckets. You can retrieve a bucket's labels using the "label get" command,
  save the output to a file, edit the file, and then use the "label set"
  command to apply those labels to the specified bucket(s). For example:

    gsutil label get gs://bucket > labels.json

  Make changes to labels.json, such as adding an additional label, then:

    gsutil label set labels.json gs://example-bucket

  Note that you can set these labels on multiple buckets at once:

    gsutil label set labels.json gs://bucket-foo gs://bucket-bar

  The "label ch" command updates a bucket's label configuration, applying the
  label changes specified by the -l and -d flags. You can specify multiple
  label changes in a single command run; all changes will be made atomically to
  each bucket.

  Examples for "ch" sub-command:

  Add the label "key-foo:value-bar" to the bucket "example-bucket":

    gsutil label ch -l key-foo:value-bar gs://example-bucket

  Change the above label to have a new value:

    gsutil label ch -l key-foo:other-value gs://example-bucket

  Add a new label and delete the old one from above:

    gsutil label ch -l new-key:new-value -d key-foo gs://example-bucket

  The "ch" sub-command has the following options

    -l          Add or update a label with the specified key and value.

    -d          Remove the label with the specified key.

In [38]:
#setting label
!gsutil label ch -l env:test gs://a-brand-new-bucket-toto/

Setting label configuration on gs://a-brand-new-bucket-toto/...

In [39]:
#getting the labels
!gsutil label get gs://a-brand-new-bucket-toto/

  "env": "test"
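When several labels have to be changed at once, typing the -l and -d modifiers by hand gets tedious. A small sketch of how the "label ch" arguments and a labels.json document for "label set" could be generated from Python data. The helper names label_ch_args and labels_document are hypothetical; only the -l key:value and -d key forms from the help text are used.

```python
import json


def label_ch_args(bucket_url, set_labels=None, delete_keys=None):
    """Build the argv list for `gsutil label ch` from a dict and a list of keys."""
    cmd = ["gsutil", "label", "ch"]
    for key, value in (set_labels or {}).items():
        cmd += ["-l", f"{key}:{value}"]  # add or update a label
    for key in (delete_keys or []):
        cmd += ["-d", key]               # remove a label
    cmd.append(bucket_url)
    return cmd


def labels_document(labels):
    """Serialise labels as a JSON document suitable for `gsutil label set`."""
    return json.dumps(labels, indent=2, sort_keys=True)


print(" ".join(label_ch_args("gs://a-brand-new-bucket-toto/", {"env": "test"})))
print(labels_document({"env": "test"}))
```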

Upload a nice photo there


In [40]:
#what does the help say?
!gsutil cp --help

  cp - Copy files and objects


  gsutil cp [OPTION]... src_url dst_url
  gsutil cp [OPTION]... src_url... dst_url
  gsutil cp [OPTION]... -I dst_url

  The gsutil cp command allows you to copy data between your local file
  system and the cloud, copy data within the cloud, and copy data between
  cloud storage providers. For example, to copy all text files from the
  local directory to a bucket you could do:

    gsutil cp *.txt gs://my-bucket

  Similarly, you can download text files from a bucket by doing:

    gsutil cp gs://my-bucket/*.txt .

  If you want to copy an entire directory tree you need to use the -r option:

    gsutil cp -r dir gs://my-bucket

  If you have a large number of files to transfer you might want to use the
  gsutil -m option, to perform a parallel (multi-threaded/multi-processing)
  copy:

    gsutil -m cp -r dir gs://my-bucket

  You can pass a list of URLs (one per line) to copy on stdin instead of as
  command line arguments by using the -I option. This allows you to use gsutil
  in a pipeline to upload or download files / objects as generated by a program,
  such as:

    some_program | gsutil -m cp -I gs://my-bucket

  or:

    some_program | gsutil -m cp -I ./download_dir

  The contents of stdin can name files, cloud URLs, and wildcards of files
  and cloud URLs.

  The gsutil cp command strives to name objects in a way consistent with how
  Linux cp works, which causes names to be constructed in varying ways depending
  on whether you're performing a recursive directory copy or copying
  individually named objects; and whether you're copying to an existing or
  non-existent directory.

  When performing recursive directory copies, object names are constructed that
  mirror the source directory structure starting at the point of recursive
  processing. For example, if dir1/dir2 contains the file a/b/c then the
  command:

    gsutil cp -r dir1/dir2 gs://my-bucket

  will create the object gs://my-bucket/dir2/a/b/c.

  In contrast, copying individually named files will result in objects named by
  the final path component of the source files. For example, again assuming
  dir1/dir2 contains a/b/c, the command:

    gsutil cp dir1/dir2/** gs://my-bucket

  will create the object gs://my-bucket/c.

  The same rules apply for downloads: recursive copies of buckets and
  bucket subdirectories produce a mirrored filename structure, while copying
  individually (or wildcard) named objects produce flatly named files.

  Note that in the above example the '**' wildcard matches all names
  anywhere under dir. The wildcard '*' will match names just one level deep. For
  more details see "gsutil help wildcards".

  There's an additional wrinkle when working with subdirectories: the resulting
  names depend on whether the destination subdirectory exists. For example,
  if gs://my-bucket/subdir exists as a subdirectory, the command:

    gsutil cp -r dir1/dir2 gs://my-bucket/subdir

  will create the object gs://my-bucket/subdir/dir2/a/b/c. In contrast, if
  gs://my-bucket/subdir does not exist, this same gsutil cp command will create
  the object gs://my-bucket/subdir/a/b/c.
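The naming rules above are easy to mix up, so here is a small Python model of them. It is only a sketch of the behaviour described in this help text, not gsutil's actual implementation, and the function names are hypothetical.

```python
import posixpath


def recursive_copy_name(src_dir, rel_path, dst_url, dst_exists=True):
    """Object name produced by `gsutil cp -r src_dir dst_url` for one file.

    rel_path is the file's path relative to src_dir. dst_exists says whether
    the destination (bucket or bucket subdirectory) already exists.
    """
    dst = dst_url.rstrip("/")
    if dst_exists:
        # Existing destination: the last component of src_dir is kept.
        return f"{dst}/{posixpath.basename(src_dir.rstrip('/'))}/{rel_path}"
    # Non-existent subdirectory: the contents land directly under it.
    return f"{dst}/{rel_path}"


def flat_copy_name(src_path, dst_url):
    """Object name for an individually (or wildcard) named source file."""
    return f"{dst_url.rstrip('/')}/{posixpath.basename(src_path)}"


print(recursive_copy_name("dir1/dir2", "a/b/c", "gs://my-bucket"))
print(flat_copy_name("dir1/dir2/a/b/c", "gs://my-bucket"))
print(recursive_copy_name("dir1/dir2", "a/b/c", "gs://my-bucket/subdir", dst_exists=False))
```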

  Note: If you use the
  `Google Cloud Platform Console <>`_
  to create folders, it does so by creating a "placeholder" object that ends
  with a "/" character. gsutil skips these objects when downloading from the
  cloud to the local file system, because attempting to create a file that
  ends with a "/" is not allowed on Linux and MacOS. Because of this, it is
  recommended that you not create objects that end with "/" (unless you don't
  need to be able to download such objects using gsutil).

  You can use gsutil to copy to and from subdirectories by using a command
  like:

    gsutil cp -r dir gs://my-bucket/data

  This will cause dir and all of its files and nested subdirectories to be
  copied under the specified destination, resulting in objects with names like
  gs://my-bucket/data/dir/a/b/c. Similarly you can download from bucket
  subdirectories by using a command like:

    gsutil cp -r gs://my-bucket/data dir

  This will cause everything nested under gs://my-bucket/data to be downloaded
  into dir, resulting in files with names like dir/data/a/b/c.

  Copying subdirectories is useful if you want to add data to an existing
  bucket directory structure over time. It's also useful if you want
  to parallelize uploads and downloads across multiple machines (potentially
  reducing overall transfer time compared with simply running gsutil -m
  cp on one machine). For example, if your bucket contains this structure:


  you could perform concurrent downloads across 3 machines by running these
  commands on each machine, respectively:

    gsutil -m cp -r gs://my-bucket/data/result_set_[0-3]* dir
    gsutil -m cp -r gs://my-bucket/data/result_set_[4-6]* dir
    gsutil -m cp -r gs://my-bucket/data/result_set_[7-9]* dir

  Note that dir could be a local directory on each machine, or it could be a
  directory mounted off of a shared file server; whether the latter performs
  acceptably will depend on a number of factors, so we recommend experimenting
  to find out what works best for your computing environment.
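The three wildcard patterns above split the leading digit of the result set number across machines. A sketch of how such patterns could be generated for any number of machines between 1 and 10; the function name shard_patterns is hypothetical and the split by leading digit is an assumption taken from the example commands.

```python
def shard_patterns(prefix, machines):
    """Split the leading digits 0-9 into contiguous ranges, one per machine."""
    if not 1 <= machines <= 10:
        raise ValueError("machines must be between 1 and 10")
    base, extra = divmod(10, machines)
    patterns, start = [], 0
    for i in range(machines):
        size = base + (1 if i < extra else 0)  # earlier shards absorb the remainder
        end = start + size - 1
        patterns.append(f"{prefix}[{start}-{end}]*")
        start = end + 1
    return patterns


for pattern in shard_patterns("gs://my-bucket/data/result_set_", 3):
    print(f"gsutil -m cp -r {pattern} dir")
```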

  If both the source and destination URL are cloud URLs from the same
  provider, gsutil copies data "in the cloud" (i.e., without downloading
  to and uploading from the machine where you run gsutil). In addition to
  the performance and cost advantages of doing this, copying in the cloud
  preserves metadata (like Content-Type and Cache-Control). In contrast,
  when you download data from the cloud it ends up in a file, which has
  no associated metadata. Thus, unless you have some way to hold on to
  or re-create that metadata, downloading to a file will not retain the
  metadata.

  Copies spanning locations and/or storage classes cause data to be rewritten
  in the cloud, which may take some time (but still will be faster than
  downloading and re-uploading). Such operations can be resumed with the same
  command if they are interrupted, so long as the command parameters are
  identical.

  Note that by default, the gsutil cp command does not copy the object
  ACL to the new object, and instead will use the default bucket ACL (see
  "gsutil help defacl"). You can override this behavior with the -p
  option (see OPTIONS below).

  One additional note about copying in the cloud: If the destination bucket has
  versioning enabled, by default gsutil cp will copy only live versions of the
  source object(s). For example:

    gsutil cp gs://bucket1/obj gs://bucket2

  will cause only the single live version of gs://bucket1/obj to be copied to
  gs://bucket2, even if there are archived versions of gs://bucket1/obj. To also
  copy archived versions, use the -A flag:

    gsutil cp -A gs://bucket1/obj gs://bucket2

  The gsutil -m flag is disallowed when using the cp -A flag, to ensure that
  version ordering is preserved.

  At the end of every upload or download the gsutil cp command validates that
  the checksum it computes for the source file/object matches the checksum
  the service computes. If the checksums do not match, gsutil will delete the
  corrupted object and print a warning message. This very rarely happens, but
  if it does, please contact

  If you know the MD5 of a file before uploading you can specify it in the
  Content-MD5 header, which will cause the cloud storage service to reject the
  upload if the MD5 doesn't match the value computed by the service. For
  example:

    % gsutil hash obj
    Hashing     obj:
    Hashes [base64] for obj:
            Hash (crc32c):          lIMoIw==
            Hash (md5):             VgyllJgiiaRAbyUUIqDMmw==

    % gsutil -h Content-MD5:VgyllJgiiaRAbyUUIqDMmw== cp obj gs://your-bucket/obj
    Copying file://obj [Content-Type=text/plain]...
    Uploading   gs://your-bucket/obj:                                182 b/182 B

    If the checksum didn't match the service would instead reject the upload and
    gsutil would print a message like:

    BadRequestException: 400 Provided MD5 hash "VgyllJgiiaRAbyUUIqDMmw=="
    doesn't match calculated MD5 hash "7gyllJgiiaRAbyUUIqDMmw==".

  Even if you don't do this gsutil will delete the object if the computed
  checksum mismatches, but specifying the Content-MD5 header has several
  advantages:

      1. It prevents the corrupted object from becoming visible at all, whereas
      otherwise it would be visible for 1-3 seconds before gsutil deletes it.

      2. If an object already exists with the given name, specifying the
      Content-MD5 header will cause the existing object never to be replaced,
      whereas otherwise it would be replaced by the corrupted object and then
      deleted a few seconds later.

      3. It will definitively prevent the corrupted object from being left in
      the cloud, whereas the gsutil approach of deleting after the upload
      completes could fail if (for example) the gsutil process gets ^C'd
      between upload and deletion request.

      4. It supports a customer-to-service integrity check handoff. For example,
      if you have a content production pipeline that generates data to be
      uploaded to the cloud along with checksums of that data, specifying the
      MD5 computed by your content pipeline when you run gsutil cp will ensure
      that the checksums match all the way through the process (e.g., detecting
      if data gets corrupted on your local disk between the time it was written
      by your content pipeline and the time it was uploaded to GCS).

  Note: The Content-MD5 header is ignored for composite objects, because such
  objects only have a CRC32C checksum.
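The Content-MD5 value is the base64 encoding of the raw MD5 digest, not the usual hex string. A minimal Python sketch that computes it for a local file and assembles the matching gsutil command; the helper names content_md5_of_file and cp_with_md5 are hypothetical.

```python
import base64
import hashlib


def content_md5_of_file(path, chunk_size=1024 * 1024):
    """Return the base64-encoded MD5 digest of a file, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return base64.b64encode(digest.digest()).decode("ascii")


def cp_with_md5(local_path, dst_url, md5_b64):
    """Build the argv list for an upload that sends the Content-MD5 header."""
    return ["gsutil", "-h", f"Content-MD5:{md5_b64}", "cp", local_path, dst_url]
```

For example, cp_with_md5("obj", "gs://your-bucket/obj", content_md5_of_file("obj")) yields the same form of command as the "gsutil -h Content-MD5:... cp" example in the help text.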

  The cp command will retry when failures occur, but if enough failures happen
  during a particular copy or delete operation the cp command will skip that
  object and move on. At the end of the copy run if any failures were not
  successfully retried, the cp command will report the count of failures, and
  exit with non-zero status.

  Note that there are cases where retrying will never succeed, such as if you
  don't have write permission to the destination bucket or if the destination
  path for some objects is longer than the maximum allowed length.

  For more details about gsutil's retry handling, please see
  "gsutil help retries".

  gsutil automatically performs a resumable upload whenever you use the cp
  command to upload an object that is larger than 8 MiB. You do not need to
  specify any special command line options to make this happen. If your upload
  is interrupted you can restart the upload by running the same cp command that
  you ran to start the upload. Until the upload has completed successfully, it
  will not be visible at the destination object and will not replace any
  existing object the upload is intended to overwrite. However, see the section
  on PARALLEL COMPOSITE UPLOADS, which may leave temporary component objects in
  place during the upload process.

  Similarly, gsutil automatically performs resumable downloads (using standard
  HTTP Range GET operations) whenever you use the cp command, unless the
  destination is a stream. In this case, a partially downloaded temporary file
  will be visible in the destination directory. Upon completion, the original
  file is deleted and overwritten with the downloaded contents.

  Resumable uploads and downloads store state information in files under
  ~/.gsutil, named by the destination object or file. If you attempt to resume a
  transfer from a machine with a different directory, the transfer will start
  over from scratch.

  See also "gsutil help prod" for details on using resumable transfers
  in production.

  Use '-' in place of src_url or dst_url to perform a streaming
  transfer. For example:

    long_running_computation | gsutil cp - gs://my-bucket/obj

  Streaming uploads using the JSON API (see "gsutil help apis") are buffered in
  memory part-way back into the file and can thus retry in the event of network
  or service problems.

  Streaming transfers using the XML API do not support resumable
  uploads/downloads. If you have a large amount of data to upload (say, more
  than 100 MiB) it is recommended that you write the data to a local file and
  then copy that file to the cloud rather than streaming it (and similarly for
  large downloads).

  WARNING: When performing streaming transfers gsutil does not compute a
  checksum of the uploaded or downloaded data. Therefore, we recommend that
  users either perform their own validation of the data or use non-streaming
  transfers (which perform integrity checking automatically).

  gsutil uses HTTP Range GET requests to perform "sliced" downloads in parallel
  when downloading large objects from Google Cloud Storage. This means that disk
  space for the temporary download destination file will be pre-allocated and
  byte ranges (slices) within the file will be downloaded in parallel. Once all
  slices have completed downloading, the temporary file will be renamed to the
  destination file. No additional local disk space is required for this
  operation.

  This feature is only available for Google Cloud Storage objects because it
  requires a fast composable checksum (CRC32C) that can be used to verify the
  data integrity of the slices. And because it depends on CRC32C, using sliced
  object downloads also requires a compiled crcmod (see "gsutil help crcmod") on
  the machine performing the download. If compiled crcmod is not available,
  a non-sliced object download will instead be performed.

  Note: since sliced object downloads cause multiple writes to occur at various
  locations on disk, this mechanism can degrade performance for disks with slow
  seek times, especially for large numbers of slices. While the default number
  of slices is set small to avoid this problem, you can disable sliced object
  download if necessary by setting the "sliced_object_download_threshold"
  variable in the .boto config file to 0.

  gsutil can automatically use
  `object composition <>`_
  to perform uploads in parallel for large, local files being uploaded to Google
  Cloud Storage. If enabled (see below), a large file will be split into
  component pieces that are uploaded in parallel and then composed in the cloud
  (and the temporary components finally deleted). A file can be broken into as
  many as 32 component pieces; until this piece limit is reached, the maximum
  size of each component piece is determined by the variable
  "parallel_composite_upload_component_size," specified in the [GSUtil] section
  of your .boto configuration file (for files that are otherwise too big,
  components are as large as needed to fit into 32 pieces). No additional local
  disk space is required for this operation.
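To get a feel for how the 32-piece limit interacts with the configured component size, here is a rough Python model of the rule as it is worded above. It is an interpretation of the help text, not gsutil's code, and plan_components is a hypothetical name.

```python
MAX_COMPONENTS = 32  # piece limit quoted in the help text


def plan_components(file_size, component_size, max_components=MAX_COMPONENTS):
    """Return (number of components, maximum size of each component) in bytes."""
    if file_size <= 0 or component_size <= 0:
        raise ValueError("sizes must be positive")
    count = -(-file_size // component_size)  # ceiling division
    if count > max_components:
        # Too many pieces: grow the components so the file fits in the limit.
        count = max_components
        component_size = -(-file_size // max_components)
    return count, component_size


MIB = 1024 * 1024
print(plan_components(150 * MIB, 50 * MIB))
print(plan_components(3300 * MIB, 50 * MIB))
```

The last component may be smaller than the returned size, since the file rarely divides evenly.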

  Using parallel composite uploads presents a tradeoff between upload
  performance and download configuration: If you enable parallel composite
  uploads your uploads will run faster, but someone will need to install a
  compiled crcmod (see "gsutil help crcmod") on every machine where objects are
  downloaded by gsutil or other Python applications. Note that for such uploads,
  crcmod is required for downloading regardless of whether the parallel
  composite upload option is on or not. For some distributions this is easy
  (e.g., it comes pre-installed on MacOS), but in other cases some users have
  found it difficult. Because of this, at present parallel composite uploads are
  disabled by default. Google is actively working with a number of the Linux
  distributions to get crcmod included with the stock distribution. Once that is
  done we will re-enable parallel composite uploads by default in gsutil.

  Warning: Parallel composite uploads should not be used with NEARLINE or
  COLDLINE storage class buckets, because doing so incurs an early deletion
  charge for each component object.

  To try parallel composite uploads you can run the command:

    gsutil -o GSUtil:parallel_composite_upload_threshold=150M cp bigfile gs://your-bucket

  where bigfile is larger than 150 MiB. When you do this notice that the upload
  progress indicator continuously updates for several different uploads at once
  (corresponding to each of the sections of the file being uploaded in
  parallel), until the parallel upload completes. If after trying this you want
  to enable parallel composite uploads for all of your future uploads
  (notwithstanding the caveats mentioned earlier), you can uncomment and set the
  "parallel_composite_upload_threshold" config value in your .boto configuration
  file to this value.

  Note that the crcmod problem only impacts downloads via Python applications
  (such as gsutil). If all users who need to download the data using gsutil or
  other Python applications can install crcmod, or if no Python users will
  need to download your objects, it makes sense to enable parallel composite
  uploads (see above). For example, if you use gsutil to upload video assets,
  and those assets will only ever be served via a Java application, it would
  make sense to enable parallel composite uploads on your machine (there are
  efficient CRC32C implementations available in Java).

  If a parallel composite upload fails prior to composition, re-running the
  gsutil command will take advantage of resumable uploads for the components
  that failed, and the component objects will be deleted after the first
  successful attempt. Any temporary objects that were uploaded successfully
  before gsutil failed will still exist until the upload is completed
  successfully. The temporary objects will be named in the following fashion:

    <random ID>/gsutil/tmp/parallel_composite_uploads/for_details_see/gsutil_help_cp/<hash>

  where <random ID> is a numerical value, and <hash> is an MD5 hash (not related
  to the hash of the contents of the file or object).

  To avoid leaving temporary objects around, you should make sure to check the
  exit status from the gsutil command.  This can be done in a bash script, for
  example, by doing:

    if ! gsutil cp ./local-file gs://your-bucket/your-object; then
      << Code that handles failures >>
    fi

  Or, for copying a directory, use this instead:

    if ! gsutil cp -c -L cp.log -r ./dir gs://bucket; then
      << Code that handles failures >>
    fi

  One important caveat is that files uploaded using parallel composite uploads
  are subject to a maximum number of components limit. For example, if you
  upload a large file that gets split into 10 components, and try to compose it
  with another object with 1015 components, the operation will fail because it
  exceeds the 1024 component limit. If you wish to compose an object later and the
  component limit is a concern, it is recommended that you disable parallel
  composite uploads for that transfer.

  Also note that an object uploaded using parallel composite uploads will have a
  CRC32C hash, but it will not have an MD5 hash (and because of that, users who
  download the object must have crcmod installed, as noted earlier). For details
  see "gsutil help crc32c".

  Parallel composite uploads can be disabled by setting the
  "parallel_composite_upload_threshold" variable in the .boto config file to 0.

  gsutil writes data to a temporary directory in several cases:

  - when compressing data to be uploaded (see the -z and -Z options)
  - when decompressing data being downloaded (when the data has
    Content-Encoding:gzip, e.g., as happens when uploaded using gsutil cp -z
    or gsutil cp -Z)
  - when running integration tests (using the gsutil test command)

  In these cases it's possible the temp file location on your system that
  gsutil selects by default may not have enough space. If gsutil runs out of
  space during one of these operations (e.g., raising
  "CommandException: Inadequate temp space available to compress <your file>"
  during a gsutil cp -z operation), you can change where it writes these
  temp files by setting the TMPDIR environment variable. On Linux and MacOS
  you can do this either by running gsutil this way:

    TMPDIR=/some/directory gsutil cp ...

  or by adding this line to your ~/.bashrc file and then restarting the shell
  before running gsutil:

    export TMPDIR=/some/directory

  On Windows 7 you can change the TMPDIR environment variable from Start ->
  Computer -> System -> Advanced System Settings -> Environment Variables.
  You need to reboot after making this change for it to take effect. (Rebooting
  is not necessary after running the export command on Linux and MacOS.)
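From a notebook it is often simpler to set TMPDIR only for the gsutil child process than to change it for the whole session. A short sketch, assuming gsutil is launched through subprocess; the helper name env_with_tmpdir is hypothetical.

```python
import os


def env_with_tmpdir(tmpdir, base_env=None):
    """Copy the environment and point TMPDIR at a directory with enough space."""
    env = dict(os.environ if base_env is None else base_env)
    env["TMPDIR"] = tmpdir
    return env


env = env_with_tmpdir("/some/directory")
# Possible usage (not executed here):
# subprocess.run(["gsutil", "cp", "-z", "txt", "big.txt", "gs://my-bucket"], env=env)
print(env["TMPDIR"])
```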


gsutil cp does not support copying special file types such as sockets, device
files, named pipes, or any other non-standard files intended to represent an
operating system resource. You should not run gsutil cp with sources that
include such files (for example, recursively copying the root directory on
Linux that includes /dev ). If you do, gsutil cp may fail or hang.

  -a canned_acl  Sets the named canned_acl when uploaded objects are created.
                 See "gsutil help acls" for further details.

  -A             Copy all source versions from a source buckets/folders.
                 If not set, only the live version of each source object is
                 copied. Note: this option is only useful when the destination
                 bucket has versioning enabled.

  -c             If an error occurs, continue to attempt to copy the remaining
                 files. If any copies were unsuccessful, gsutil's exit status
                 will be non-zero even if this flag is set. This option is
                 implicitly set when running "gsutil -m cp...". Note: -c only
                 applies to the actual copying operation. If an error occurs
                 while iterating over the files in the local directory (e.g.,
                 invalid Unicode file name) gsutil will print an error message
                 and abort.

  -D             Copy in "daisy chain" mode, i.e., copying between two buckets
                 by hooking a download to an upload, via the machine where
                 gsutil is run. This stands in contrast to the default, where
                 data are copied between two buckets "in the cloud", i.e.,
                 without needing to copy via the machine where gsutil runs.

                 By default, a "copy in the cloud" when the source is a
                 composite object will retain the composite nature of the
                 object. However, Daisy chain mode can be used to change a
                 composite object into a non-composite object. For example:

                     gsutil cp -D -p gs://bucket/obj gs://bucket/obj_tmp
                     gsutil mv -p gs://bucket/obj_tmp gs://bucket/obj

                 Note: Daisy chain mode is automatically used when copying
                 between providers (e.g., to copy data from Google Cloud Storage
                 to another provider).

  -e             Exclude symlinks. When specified, symbolic links will not be
                 copied.

  -I             Causes gsutil to read the list of files or objects to copy from
                 stdin. This allows you to run a program that generates the list
                 of files to upload/download.

  -L <file>      Outputs a manifest log file with detailed information about
                 each item that was copied. This manifest contains the following
                 information for each item:

                 - Source path.
                 - Destination path.
                 - Source size.
                 - Bytes transferred.
                 - MD5 hash.
                 - UTC date and time transfer was started in ISO 8601 format.
                 - UTC date and time transfer was completed in ISO 8601 format.
                 - Upload id, if a resumable upload was performed.
                 - Final result of the attempted transfer, success or failure.
                 - Failure details, if any.

                 If the log file already exists, gsutil will use the file as an
                 input to the copy process, and will also append log items to
                 the existing file. Files/objects that are marked in the
                 existing log file as having been successfully copied (or
                 skipped) will be ignored. Files/objects without entries will be
                 copied and ones previously marked as unsuccessful will be
                 retried. This can be used in conjunction with the -c option to
                 build a script that copies a large number of objects reliably,
                 using a bash script like the following:

                   until gsutil cp -c -L cp.log -r ./dir gs://bucket; do
                     sleep 1
                   done

                 The -c option will cause copying to continue after failures
                 occur, and the -L option will allow gsutil to pick up where it
                 left off without duplicating work. The loop will continue
                 running as long as gsutil exits with a non-zero status (such a
                 status indicates there was at least one failure during the
                 gsutil run).

                 Note: If you're trying to synchronize the contents of a
                 directory and a bucket (or two buckets), see
                 "gsutil help rsync".

  -n             No-clobber. When specified, existing files or objects at the
                 destination will not be overwritten. Any items that are skipped
                 by this option will be reported as being skipped. This option
                 will perform an additional GET request to check if an item
                 exists before attempting to upload the data. This will save
                 retransmitting data, but the additional HTTP requests may make
                 small object transfers slower and more expensive.

  -p             Causes ACLs to be preserved when copying in the cloud. Note
                 that this option has performance and cost implications when
                 using  the XML API, as it requires separate HTTP calls for
                 interacting with ACLs. (There are no such performance or cost
                 implications when using the -p option with the JSON API.) The
                 performance issue can be mitigated to some degree by using
                 gsutil -m cp to cause parallel copying. Note that this option
                 only works if you have OWNER access to all of the objects that
                 are copied.

                 You can avoid the additional performance and cost of using
                 cp -p if you want all objects in the destination bucket to end
                 up with the same ACL by setting a default object ACL on that
                 bucket instead of using cp -p. See "gsutil help defacl".

                 Note that it's not valid to specify both the -a and -p options
                 in the same command.

  -P             Causes POSIX attributes to be preserved when objects are
                 copied. With this feature enabled, gsutil cp will copy fields
                 provided by stat. These are the user ID of the owner, the group
                 ID of the owning group, the mode (permissions) of the file, and
                 the access/modification time of the file. For downloads, these
                 attributes will only be set if the source objects were uploaded
                 with this flag enabled.

                 On Windows, this flag will only set and restore access time and
                 modification time. This is because Windows doesn't have a
                 notion of POSIX uid/gid/mode.

  -R, -r         The -R and -r options are synonymous. Causes directories,
                 buckets, and bucket subdirectories to be copied recursively.
                 If you neglect to use this option for an upload, gsutil will
                 copy any files it finds and skip any directories. Similarly,
                 neglecting to specify this option for a download will cause
                 gsutil to copy any objects at the current bucket directory
                 level, and skip any subdirectories.

  -s <class>     The storage class of the destination object(s). If not
                 specified, the default storage class of the destination bucket
                 is used. Not valid for copying to non-cloud destinations.

  -U             Skip objects with unsupported object types instead of failing.
                 Unsupported object types are Amazon S3 Objects in the GLACIER
                 storage class.

  -v             Requests that the version-specific URL for each uploaded object
                 be printed. Given this URL you can make future upload requests
                 that are safe in the face of concurrent updates, because Google
                 Cloud Storage will refuse to perform the update if the current
                 object version doesn't match the version-specific URL. See
                 "gsutil help versions" for more details.

  -z <ext,...>   Applies gzip content-encoding to file uploads with the given
                 extensions. This is useful when uploading files with
                 compressible content (such as .js, .css, or .html files)
                 because it saves network bandwidth and space in Google Cloud
                 Storage, which in turn reduces storage costs.

                 When you specify the -z option, the data from your files is
                 compressed before it is uploaded, but your actual files are
                 left uncompressed on the local disk. The uploaded objects
                 retain the Content-Type and name of the original files but are
                 given a Content-Encoding header with the value "gzip" to
                 indicate that the object data stored are compressed on the
                 Google Cloud Storage servers.

                 For example, the following command:

                   gsutil cp -z html -a public-read cattypes.html gs://mycats

                 will do all of the following:

                 - Upload as the object gs://mycats/cattypes.html (cp command)
                 - Set the Content-Type to text/html (based on file extension)
                 - Compress the data in the file cattypes.html (-z option)
                 - Set the Content-Encoding to gzip (-z option)
                 - Set the ACL to public-read (-a option)
                 - If a user tries to view cattypes.html in a browser, the
                   browser will know to uncompress the data based on the
                   Content-Encoding header, and to render it as HTML based on
                   the Content-Type header.

                 Note that if you download an object with Content-Encoding:gzip
                 gsutil will decompress the content before writing the local
                 file.

  -Z             Applies gzip content-encoding to file uploads. This option
                 works like the -z option described above, but it applies to
                 all uploaded files, regardless of extension.

                 Warning: If you use this option and some of the source files
                 don't compress well (e.g., that's often true of binary data),
                 this option may result in files taking up more space in the
                 cloud than they would if left uncompressed.
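The -Z warning above can be checked locally without uploading anything. This is only a rough sketch: it uses Python's gzip module with its default settings, which is an assumption and may not match the compression settings gsutil uses. The sample data and the helper name gzip_ratio are made up for illustration.

```python
import gzip
import os


def gzip_ratio(data):
    """Return compressed size divided by original size (below 1.0 means gzip helped)."""
    return len(gzip.compress(data)) / len(data)


# Repetitive, HTML-like bytes: the kind of content -z/-Z is meant for.
text_like = b"<li>cat type</li>\n" * 500

# Random bytes stand in for binary data that does not compress well
# (an assumption for illustration, e.g. an already-compressed image).
random_like = os.urandom(len(text_like))

print("text-like ratio:   %.3f" % gzip_ratio(text_like))
print("random-like ratio: %.3f" % gzip_ratio(random_like))
```

A ratio above 1.0 for the random bytes is the case the warning describes: the gzip framing adds overhead and nothing is saved.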

In [41]:
#let's copy Formentera.JPG there
!gsutil cp "Formentera.JPG" gs://a-brand-new-bucket-toto/

Copying file://Formentera.JPG [Content-Type=image/jpeg]...
| [1 files][  2.2 MiB/  2.2 MiB]                                                
Operation completed over 1 objects/2.2 MiB.                                      

In [42]:
#checking the result
!gsutil ls gs://a-brand-new-bucket-toto/


In [44]:
#the file is there; let's look at its size and timestamp
!gsutil ls -l gs://a-brand-new-bucket-toto/Formentera.JPG

   2275011  2017-06-30T20:48:39Z  gs://a-brand-new-bucket-toto/Formentera.JPG
TOTAL: 1 objects, 2275011 bytes (2.17 MiB)
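If the listing needs to be used from Python rather than just read, each object line can be split into its three fields. A minimal sketch, assuming the layout shown in the output above (size in bytes, UTC timestamp, object URL); the function name parse_ls_line is hypothetical and not part of gsutil.

```python
from datetime import datetime, timezone


def parse_ls_line(line):
    """Split one `gsutil ls -l` object line into (size_bytes, utc_datetime, url)."""
    # Split on whitespace at most twice so an object name containing
    # spaces stays in one piece.
    size, timestamp, url = line.split(None, 2)
    when = datetime.strptime(timestamp, "%Y-%m-%dT%H:%M:%SZ")
    return int(size), when.replace(tzinfo=timezone.utc), url.strip()


# The object line from the listing above.
line = "   2275011  2017-06-30T20:48:39Z  gs://a-brand-new-bucket-toto/Formentera.JPG"
size, when, url = parse_ls_line(line)

print(size, when.isoformat(), url)
print(round(size / 2**20, 2), "MiB")  # 2.17, matching the TOTAL line
```

The TOTAL line has a different shape, so it should be filtered out before calling this helper.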

In [ ]:
#with more details
!gsutil ls -L gs://a-brand-new-bucket-toto/Formentera.JPG

In [45]:
#let's make it public!
!gsutil acl ch -u AllUsers:R gs://a-brand-new-bucket-toto/Formentera.JPG

Updated ACL on gs://a-brand-new-bucket-toto/Formentera.JPG

The photo is now publicly readable on the internet!
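Once AllUsers has read access, the object can be fetched over plain HTTPS. The sketch below builds the link, assuming the usual https://storage.googleapis.com/BUCKET/OBJECT pattern for public objects; that pattern is not shown in this notebook, and the helper name public_url is made up for illustration.

```python
from urllib.parse import quote


def public_url(bucket, object_name):
    """Build the HTTPS URL for a publicly readable object (assumed URL pattern)."""
    # Percent-encode the object name but keep "/" so pseudo-directories survive.
    return "https://storage.googleapis.com/{}/{}".format(
        bucket, quote(object_name, safe="/")
    )


print(public_url("a-brand-new-bucket-toto", "Formentera.JPG"))
```

The link only works while the object exists under that name and keeps its public-read ACL.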


In [46]:
#let's move it
!gsutil mv gs://a-brand-new-bucket-toto/Formentera.JPG gs://a-brand-new-bucket-toto/formentera.jpeg

Copying gs://a-brand-new-bucket-toto/Formentera.JPG [Content-Type=image/jpeg]...
Removing gs://a-brand-new-bucket-toto/Formentera.JPG...                         

Operation completed over 1 objects/2.2 MiB.                                      

In [47]:
#check the result
!gsutil ls gs://a-brand-new-bucket-toto


In [49]:
#delete the file
!gsutil rm gs://a-brand-new-bucket-toto/formentera.jpeg

Removing gs://a-brand-new-bucket-toto/formentera.jpeg...
/ [1 objects]                                                                   
Operation completed over 1 objects.                                              

In [51]:
#deleting the bucket (rm -r removes any remaining objects, then the bucket itself)
!gsutil rm -r gs://a-brand-new-bucket-toto

Removing gs://a-brand-new-bucket-toto/...

In [52]:
#check the result
!gsutil ls


In [ ]: