[nfsv4] device IDs and stateids

Mike Eisler <email2mre-ietf@yahoo.com> Sat, 10 November 2007 22:17 UTC

Date: Sat, 10 Nov 2007 14:17:53 -0800
From: Mike Eisler <email2mre-ietf@yahoo.com>
To: nfsv4@ietf.org
In-Reply-To: <C98692FD98048C41885E0B0FACD9DFB80567E79C@exnane01.hq.netapp.com>

I am unnerved by the seemingly unstoppable drive toward having
zillions of stateids for device ID to device address
mappings, and still don't understand why we are going in
that direction. The latest I've read is that there will
be only one stateid per major device ID. Given that a major
device ID has ballooned to 64 bits, that does not comfort
me. The current path means that device ID notifications
have little, if any, utility, given that delegations for
2^64 different major device IDs can be handed out.

What are the requirements? No one has put them down in a
concise list as far as I know. So I will state them,
based on what I've read and heard since we had a pNFS
inspection meeting during the Austin bake-a-thon.

Requirements:

- Server

-- Without forcing the client to re-issue GETDEVICELIST,

--- Be able to invalidate a single device ID's
    address mappings

--- Be able to change a single device ID's
    address mappings

--- Be able to add a single device ID's
    address mappings

-- Allow recall of layouts by FSID, and possibly other
   coarse grained objects. My understanding is that for
   some layout types and/or pNFS server implementations
   this is critical.

- Client

-- Do not burden the general pNFS implementation
   with aspects that are specific to a pNFS server or
   implementation (e.g. the possibility of recall by
   FSID requires clients to organize layouts by FSID,
   which is really a kick in the teeth if the server
   never issues such a recall).

-- If there must be a recall by a coarse grained object,
   don't force the client implementation to understand what
   that object is. Adding a layout type or a pNFS server
   implementation should not be an occasion to re-write
   the client's generic pNFS layer.

- Both client and server

-- Minimize the cost of implementing these requirements.
   E.g. do not go down the road of requiring multiple
   stateids for devices.

So here is the outline of what I've proposed before,
in more detail this time:

- DeviceIDs have a major and minor component. The current
  consensus that they be the same size and format as the
  fsid makes sense, since it makes it easier for servers
  to implement recall by fsid.

- Replace layout recall by FSID with layout recall by major
  device ID. Now the server can organize its data how it
  sees fit without giving the client information it doesn't
  need.

- Allow servers to indicate that they don't require
  clients to organize layouts by major device ID by
  encoding the major device ID such that if

  	did_major & 0x8000000000000000 is FALSE

  then this is an indication the device ID is to be
  treated as a flat object and no recall by major device
  ID will occur.
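
A minimal sketch of the bit test above, assuming the major device ID
is handled as an unsigned 64-bit integer. The names
DID_MAJOR_HIERARCHICAL and recall_by_major_possible are hypothetical
and do not come from any draft; only the mask value is taken from
the proposal.

```python
# Hypothetical names; only the mask value comes from the proposal.
DID_MAJOR_HIERARCHICAL = 0x8000000000000000  # high-order bit of a 64-bit major ID


def recall_by_major_possible(did_major: int) -> bool:
    """Return True if the server may recall layouts by this major device ID.

    When the high-order bit is clear, the device ID is treated as a flat
    object and the client need not organize layouts by major device ID.
    """
    if not 0 <= did_major < 2**64:
        raise ValueError("did_major must fit in 64 bits")
    return (did_major & DID_MAJOR_HIERARCHICAL) != 0
```

A client that only ever sees flat device IDs could then skip building
any per-major index of its layouts.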

- There is no more than one device stateid per
  clientID/layout type pair. It is returned by
  GETDEVICEINFO and GETDEVICELIST.

- Enhance GETDEVICELIST to optionally take a major device
  ID. This will allow the client to obtain all the minor
  device IDs for a major device ID as well as the address
  mappings. This is actually the only new thing I've added.
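
To illustrate the optional argument, here is a rough server-side
sketch. It assumes the server keeps its mappings in a dictionary
keyed by (major, minor) device ID; that table layout, the function
name get_device_list, and the address strings are all hypothetical
and only model the filtering behaviour, not the wire format.

```python
from typing import Dict, List, Optional, Tuple

DeviceID = Tuple[int, int]  # (major, minor); hypothetical in-memory form


def get_device_list(
    mappings: Dict[DeviceID, List[str]],
    major: Optional[int] = None,
) -> List[Tuple[DeviceID, List[str]]]:
    """Return (device ID, addresses) pairs, sorted by device ID.

    With major=None every mapping is returned (today's behaviour);
    with a major device ID only that major's minor devices are returned.
    """
    return sorted(
        (did, addrs)
        for did, addrs in mappings.items()
        if major is None or did[0] == major
    )
```

For example, calling get_device_list(table, major=m) after a
notification about major device ID m would return just the minor
devices that remain under m, together with their address mappings.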

- The preferred way device ID mappings are updated is through
  notification by CB_NOTIFY (or the server can recall
  the device stateid, and force the client to re-issue
  GETDEVICELIST/GETDEVICEINFO for all mappings). After
  CB_NOTIFY is received, the client can issue GETDEVICELIST
  or GETDEVICEINFO as appropriate.
  CB_NOTIFY has the following device ID notifications:

-- NOTIFY4_DEVICE_NEW_MAJOR: a new major device ID has been
   added. If the client cares, it can issue GETDEVICELIST
   to obtain it.

-- NOTIFY4_DEVICE_ADDED_TO_MAJOR: one or more minor
   devices added to the major device. If the client cares,
   it can issue GETDEVICELIST to get the current minor
   devices for the major device.

-- NOTIFY4_DEVICE_MAJOR_DELETED: some or all device IDs
   with the specified major device ID are deleted. If
   just some device IDs are deleted, the client can use
   GETDEVICELIST with the major device ID to see what
   is left.

-- NOTIFY4_DEVICE_ID_ADD - full deviceID, as in draft-15
-- NOTIFY4_DEVICE_ID_CHANGE - full deviceID, as in draft-15
-- NOTIFY4_DEVICE_ID_DELETE - full deviceID, as in draft-15
  
  These notifications are asynchronous. It is up to the
  pNFS server to prevent the use of deleted or updated
  device IDs.  We have at our disposal:

  - fencing (preferred)
  - the SLA method that the blocks layout type
    OPTIONALly specifies (if necessary)

  For those who argue that recall per major device
  ID is safer, I see the same race conditions, and the
  same solutions: fencing and SLA.
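
The client side of the notifications listed above could be sketched
roughly as follows. The enum mirrors the six NOTIFY4_DEVICE_*
names from the list; the function follow_up and its return shape
are hypothetical. Mapping the three per-device-ID notifications to
GETDEVICEINFO (or to nothing, for a delete) is my assumption about
what "as appropriate" means, not something the drafts state.

```python
from enum import Enum, auto
from typing import Optional, Tuple


class Notify(Enum):
    DEVICE_NEW_MAJOR = auto()
    DEVICE_ADDED_TO_MAJOR = auto()
    DEVICE_MAJOR_DELETED = auto()
    DEVICE_ID_ADD = auto()
    DEVICE_ID_CHANGE = auto()
    DEVICE_ID_DELETE = auto()


# Notifications that name only a major device ID.
_BY_MAJOR = {
    Notify.DEVICE_NEW_MAJOR,
    Notify.DEVICE_ADDED_TO_MAJOR,
    Notify.DEVICE_MAJOR_DELETED,
}


def follow_up(
    kind: Notify, major: int, minor: Optional[int] = None
) -> Optional[Tuple[str, object]]:
    """Return the (operation, argument) a client might issue next, or None."""
    if kind in _BY_MAJOR:
        # Re-read the minor devices (and mappings) under this major ID.
        return ("GETDEVICELIST", major)
    if minor is None:
        raise ValueError("per-device notifications carry a full device ID")
    if kind in (Notify.DEVICE_ID_ADD, Notify.DEVICE_ID_CHANGE):
        # Assumption: fetch the new or changed mapping for one device ID.
        return ("GETDEVICEINFO", (major, minor))
    # DEVICE_ID_DELETE: drop the cached mapping; nothing to fetch.
    return None
```

Because the notifications are asynchronous, nothing in this sketch
protects against I/O to a stale device; that remains the server's
job via fencing or the SLA method.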

--- "Noveck, Dave" <Dave.Noveck@netapp.com> wrote:

> > However, I see that I was confused. I didn't really mean a recall of
> > the fsid, but of the deviceID. Are we, or are we not going to support
> > recalling deviceIDs?
> 
> I can't predict the future.  The only thing I can do (barely) is to 
> read the current version of the spec based on the CVS repository and
> in there I don't see any provision for recalling a specific device
> (and somehow implicitly recalling all layout segment for that device
> which it seems like you are depending on).
> 
> > There was some discussion about doing this, since Marc in particular
> > wanted the ability to invalidate a deviceID without recalling the layout
> > (the client would presumably re-issue a GETDEVICEINFO request and get
> > directed to a new DS) but I've lost track of the status of that request.
> 
> I think that that is the notify.
> 
> Device maps can also be recalled but there you are recalling all
> devices with a given major id which may be too coarse-grained for what
> you want (or you may be able to arrange things so it isn't).  The effect
> on existing layouts (whether there is an implicit recall) is not clear
> at all.  One problem is that although by giving device maps stateids, we
> have implicitly provided a recall mechanism, the text for CB_RECALL
> only mentions recall of delegations, and so the status and semantics
> of recalls of device maps aren't very clear.
> 
> I'm also having trouble with this text:
> 
>    Device ID mappings represent another form of stateid
>    Section 8.2.1.  The GETDEVICEINFO and GETDEVICELIST operations each
>    return a device stateid.
> 
> My assumption is one stateid per major device. 
> 
>    Like file delegations, the device stateid
>    is recallable.  A recall of the device stateid will remove or
>    invalidate the device ID mappings 
> 
> are "remove" and "invalidate" intended to be synonyms or are these two
> different things which can be done?
> 
>    as well as lease expiration.  
> 
> Huh?
> 
>    The
>    GETDEVICEINFO and GETDEVICELIST operations update the current
>    filehandle to facilitate the recall of the device stateid. 
> 
> I've already asked about this with regard to text for the ops
> GETDEVICEINFO and GETDEVICELIST.  I still have no clue what is
> intended.
> 
> I'm not sure, but it looks to me like, to do what you want, the MDS
> would have to do layout recalls on each of the individual stripes
> that the device covered.
>
