9.1  File Hashing

   BEGIN: RFCEDITOR REMOVE BEFORE PUBLISHING

   After some discussion of this at connectathon, I know of two uses for
   this feature, neither one of which the feature is entirely suited
   for:
   o  Checking that a file has been uploaded to the server correctly;
      some portion of the customers wanting this feature want it in a
      security sense, as part of proof the server has the file.
   o  Optimizing upload or download of the file; multiple hashes are
      performed on small pieces of the file and the results are used to
      determine what chunks of the file, if any, need to be transferred.
      This is similar to the way rsync works.

   I've seen both of these implemented.

   For the first case, the extension has several drawbacks, including:
   o  A FIPS implementation can't ship MD5.
   o  MD5's security is potentially weaker than other options.
   o  Being hard-coded to MD5 makes it impossible to adapt to future
      developments in the arena of MD5 compromises.

   For the second case, the extension has these drawbacks:
   o  MD5 is expensive (relative to other options.)
   o  The extension must be sent potentially thousands of times to
      retrieve the desired granularity of hashes.

   Therefore, for this draft, this section is marked experimental; I've
   included a second proposed extension.  Please post your thoughts on
   the mailing list.  (I did it this way just so I could get a draft out
   that I and my active co-author are happy with.)

   In addition, implementation experience has shown the quick check hash
   to not be useful.

   END: RFCEDITOR REMOVE BEFORE PUBLISHING

9.1.1  Checking File Contents: v5 extension

   This extension allows a client to easily check if a file (or portion
   thereof) that it already has matches what is on the server.

       byte   SSH_FXP_EXTENDED
       uint32 request-id
       string "md5-hash" / "md5-hash-handle"
       string filename [UTF-8] / file-handle
       uint64 start-offset
       uint64 length
       string quick-check-hash

   filename
      Used if "md5-hash" is specified; indicates the name of the file to
      use.  The hash will be of the file contents as it would appear on
      the wire if the file were opened with no special flags.

   file-handle
      Used if "md5-hash-handle" is specified; specifies a file handle to
      read the data from.  The handle MUST be a file handle, and
      ACE4_READ_DATA MUST have been included in the desired-access when
      the file was opened.

      If this file handle was opened in SSH_FXF_ACCESS_TEXT_MODE mode,
      the md5-hash must be made of the data as it would be sent on the
      wire.

   start-offset
      The starting offset of the data to hash.

   length
      The length of data to include in the hash.  If both start-offset
      and length are zero, the entire file should be included.

   quick-check-hash
      The hash over the first 2048 bytes of the data range as the client
      knows it, or the entire range, if it is less than 2048 bytes.
      This allows the server to quickly check if it is worth the
      resources to hash a big file.

      If this is a zero length string, the client does not have the
      data, and is requesting the hash for reasons other than comparing
      with a local file.  The server MAY return SSH_FX_OP_UNSUPPORTED in
      this case.
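
   As an illustration only, the request above could be encoded along
   the following lines.  This is a minimal sketch, not part of the
   extension: the helper names (ssh_string, quick_check_hash,
   build_md5_hash_request) are hypothetical, the quick-check-hash is
   assumed to be a raw 16-byte MD5 digest, and the outer uint32 packet
   length is left to the transport code.

```python
import hashlib
import struct

SSH_FXP_EXTENDED = 200  # packet type value used by the SFTP drafts


def ssh_string(data: bytes) -> bytes:
    """Encode an SSH 'string': a uint32 length followed by the raw bytes."""
    return struct.pack(">I", len(data)) + data


def quick_check_hash(local_range: bytes) -> bytes:
    """MD5 over the first 2048 bytes of the range, or all of it if shorter.

    Assumption: the hash is sent as the raw digest, not as hex text.
    """
    return hashlib.md5(local_range[:2048]).digest()


def build_md5_hash_request(request_id, filename, start_offset, length,
                           local_range=None):
    """Build the body of an "md5-hash" request (no outer packet length).

    Passing local_range=None sends a zero-length quick-check-hash, which
    tells the server the client has no local copy to compare against.
    """
    quick = b"" if local_range is None else quick_check_hash(local_range)
    return (
        struct.pack(">BI", SSH_FXP_EXTENDED, request_id)
        + ssh_string(b"md5-hash")
        + ssh_string(filename.encode("utf-8"))
        + struct.pack(">QQ", start_offset, length)
        + ssh_string(quick)
    )
```

   Note that hashlib.md5 may be unavailable on FIPS-restricted Python
   builds, which mirrors the first drawback listed in the editor's
   note.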


   The response is either a SSH_FXP_STATUS packet, indicating an error,
   or the following extended reply packet:

       byte   SSH_FXP_EXTENDED_REPLY
       uint32 request-id
       string "md5-hash"
       string hash

   If 'hash' is zero length, then the 'quick-check-hash' did not match,
   and no hash operation was performed.  Otherwise, 'hash' contains the
   hash of the entire data range (including the first 2048 bytes that
   were included in the 'quick-check-hash'.)
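
   A client might interpret this reply roughly as follows.  This is a
   sketch under assumptions: the function and status names are
   hypothetical, the input is the packet body without its outer
   uint32 length, and 'hash' is assumed to carry the raw MD5 digest.

```python
import struct

SSH_FXP_EXTENDED_REPLY = 201  # packet type value used by the SFTP drafts


def read_string(buf: bytes, pos: int):
    """Decode an SSH 'string' at pos; return (bytes, next position)."""
    (n,) = struct.unpack_from(">I", buf, pos)
    start, end = pos + 4, pos + 4 + n
    if end > len(buf):
        raise ValueError("truncated string")
    return buf[start:end], end


def interpret_md5_hash_reply(body: bytes, local_md5: bytes):
    """Return (request_id, status) for an "md5-hash" extended reply.

    status is one of:
      "quick-check-mismatch"  zero-length hash: the server did not hash
      "match"                 the server's hash equals the local digest
      "mismatch"              the server hashed the range and it differs
    """
    ptype, request_id = struct.unpack_from(">BI", body, 0)
    if ptype != SSH_FXP_EXTENDED_REPLY:
        raise ValueError("not an SSH_FXP_EXTENDED_REPLY packet")
    name, pos = read_string(body, 5)
    if name != b"md5-hash":
        raise ValueError("unexpected extended reply name")
    digest, pos = read_string(body, pos)
    if not digest:
        return request_id, "quick-check-mismatch"
    return request_id, "match" if digest == local_md5 else "mismatch"
```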

9.1.2  Checking File Contents

   This extension allows a client to easily check if a file (or portion
   thereof) that it already has matches what is on the server.

       byte   SSH_FXP_EXTENDED
       uint32 request-id
       string "check-file-handle" / "check-file-name"
       string handle / name
       string hash-algorithm-list
       uint64 start-offset
       uint64 length
       uint32 block-size

   handle
      For "check-file-handle", 'handle' is an open file handle returned
      by SSH_FXP_OPEN.  If 'handle' is not a handle returned by
      SSH_FXP_OPEN, the server MUST return SSH_FX_INVALID_HANDLE.  If
      ACE4_READ_DATA was not included when the file was opened, the
      server MUST return SSH_FX_PERMISSION_DENIED.

      If this file handle was opened in SSH_FXF_ACCESS_TEXT_MODE mode,
      the check must be performed on the data as it would be sent on the
      wire.

   name
      For "check-file-name", 'name' is the path to the file to check.
      If 'name' refers to a directory, SSH_FX_FILE_IS_A_DIRECTORY SHOULD
      be returned.  If 'name' refers to a SSH_FILEXFER_TYPE_SYMLINK, the
      target should be opened.  The results are undefined for file types
      other than SSH_FILEXFER_TYPE_REGULAR.

      The file MUST be opened without the SSH_FXF_ACCESS_TEXT_MODE
      access flag (in binary mode.)

   hash-algorithm-list
      A comma separated list of hash algorithms the client is willing to
      accept for this operation.  The server MUST pick the first hash on
      the list that it supports.

      Currently defined algorithms are "md5", "sha1", "sha224",
      "sha256", "sha384", "sha512", and "crc32".  Additional algorithms
      may be added by following the DNS extensibility naming convention
      outlined in [I-D.ietf-secsh-architecture].

      MD5 is described in [RFC1321].  SHA-1, SHA-224, SHA-256, SHA-384,
      and SHA-512 are described in [FIPS-180-2].  [ISO.3309.1991]
      describes crc32, and is the same algorithm used in [RFC1510].

   start-offset
      The starting offset of the data to include in the hash.

   length
      The length of data to include in the hash.  If length is zero, all
      the data from start-offset to the end-of-file should be included.

   block-size
      An independent hash MUST be computed over every block in the file.
      The size of blocks is specified by block-size.  The block-size
      MUST NOT be smaller than 256 bytes.  If the block-size is 0, then
      only one hash, over the entire range, MUST be made.
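
   The server-side handling of 'hash-algorithm-list', 'start-offset',
   'length', and 'block-size' described above can be sketched as
   follows.  This is illustrative only: the names HASHERS,
   pick_algorithm, and block_hashes are hypothetical, the file
   contents are held in memory for brevity, a range that runs past
   end-of-file is assumed to be clamped, and the crc32 value is
   assumed to be emitted as four big-endian bytes (the text does not
   specify a byte order).

```python
import hashlib
import zlib

# Algorithm names from the text, mapped to functions returning raw digests.
HASHERS = {
    "md5": lambda d: hashlib.md5(d).digest(),
    "sha1": lambda d: hashlib.sha1(d).digest(),
    "sha224": lambda d: hashlib.sha224(d).digest(),
    "sha256": lambda d: hashlib.sha256(d).digest(),
    "sha384": lambda d: hashlib.sha384(d).digest(),
    "sha512": lambda d: hashlib.sha512(d).digest(),
    # Assumption: big-endian byte order for the 32-bit CRC.
    "crc32": lambda d: zlib.crc32(d).to_bytes(4, "big"),
}


def pick_algorithm(hash_algorithm_list: str, supported=HASHERS):
    """Return the first algorithm on the client's list that is supported."""
    for name in hash_algorithm_list.split(","):
        if name in supported:
            return name
    return None  # the caller would answer with an SSH_FXP_STATUS error


def block_hashes(data: bytes, algo: str, start_offset: int, length: int,
                 block_size: int) -> bytes:
    """Concatenate one digest per block over the requested range."""
    if block_size != 0 and block_size < 256:
        raise ValueError("block-size MUST NOT be smaller than 256 bytes")
    # length == 0 means "from start-offset to end-of-file".
    end = len(data) if length == 0 else min(start_offset + length, len(data))
    region = data[start_offset:end]
    digest = HASHERS[algo]
    if block_size == 0:
        return digest(region)  # a single hash over the entire range
    return b"".join(
        digest(region[i:i + block_size])
        for i in range(0, len(region), block_size)
    )
```

   A production server would more likely read and hash the file
   incrementally rather than load the whole range into memory.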


   The response is either a SSH_FXP_STATUS packet, indicating an error,
   or the following extended reply packet:

       byte   SSH_FXP_EXTENDED_REPLY
       uint32 request-id
       string "check-file"
       string hash-algo-used
       byte   hash[n][block-count]

   hash-algo-used
      The hash algorithm that was actually used.

   hash
      The computed hashes.  The hash algorithm used determines the size
      of n.  The number of block-size chunks of data in the file
      determines block-count.  The hashes are placed in the packet one
      after another, with no decoration.

      Note that if the length of the range is not an even multiple of
      block-size, the last hash will have been computed over only the
      remainder of the range instead of a full block.
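
   To show how a client might consume this reply for the rsync-like
   use case, the sketch below splits the undecorated hash bytes into
   per-block digests and lists the blocks that differ from a local
   copy.  The function names are hypothetical, the input is the packet
   body without its outer uint32 length, and differing_blocks assumes
   a non-zero block-size and an algorithm that Python's hashlib
   provides (so not crc32).

```python
import hashlib
import struct

SSH_FXP_EXTENDED_REPLY = 201  # packet type value used by the SFTP drafts

# Digest sizes in bytes: the 'n' in hash[n][block-count].
DIGEST_SIZES = {"md5": 16, "sha1": 20, "sha224": 28, "sha256": 32,
                "sha384": 48, "sha512": 64, "crc32": 4}


def read_string(buf: bytes, pos: int):
    """Decode an SSH 'string' at pos; return (bytes, next position)."""
    (n,) = struct.unpack_from(">I", buf, pos)
    start, end = pos + 4, pos + 4 + n
    if end > len(buf):
        raise ValueError("truncated string")
    return buf[start:end], end


def parse_check_file_reply(body: bytes):
    """Return (request_id, algorithm, [digest, ...]) from a reply body."""
    ptype, request_id = struct.unpack_from(">BI", body, 0)
    if ptype != SSH_FXP_EXTENDED_REPLY:
        raise ValueError("not an SSH_FXP_EXTENDED_REPLY packet")
    name, pos = read_string(body, 5)
    if name != b"check-file":
        raise ValueError("unexpected extended reply name")
    algo_raw, pos = read_string(body, pos)
    algo = algo_raw.decode("ascii")
    n = DIGEST_SIZES[algo]
    raw = body[pos:]  # hashes run to the end of the packet, undecorated
    if len(raw) % n != 0:
        raise ValueError("hash data is not a whole number of digests")
    return request_id, algo, [raw[i:i + n] for i in range(0, len(raw), n)]


def differing_blocks(local: bytes, algo: str, block_size: int, remote):
    """Return indices of blocks whose local digest differs from the server's.

    The final block may be shorter than block_size; slicing handles that.
    """
    changed = []
    for index, remote_digest in enumerate(remote):
        block = local[index * block_size:(index + 1) * block_size]
        if hashlib.new(algo, block).digest() != remote_digest:
            changed.append(index)
    return changed
```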