From nfsv4-bounces@ietf.org Mon May 02 17:08:57 2005
Received: from localhost.localdomain ([127.0.0.1] helo=megatron.ietf.org)
	by megatron.ietf.org with esmtp (Exim 4.32)
	id 1DSi9x-0005jC-Lr; Mon, 02 May 2005 17:08:57 -0400
Received: from odin.ietf.org ([132.151.1.176] helo=ietf.org)
	by megatron.ietf.org with esmtp (Exim 4.32) id 1DSi9w-0005Vz-IR
	for nfsv4@megatron.ietf.org; Mon, 02 May 2005 17:08:56 -0400
Received: from ietf-mx.ietf.org (ietf-mx.ietf.org [132.151.6.1])
	by ietf.org (8.9.1a/8.9.1a) with ESMTP id RAA15324
	for <nfsv4@ietf.org>; Mon, 2 May 2005 17:08:54 -0400 (EDT)
Received: from citi.umich.edu ([141.211.133.111])
	by ietf-mx.ietf.org with esmtp (Exim 4.33) id 1DSiNg-0002Pp-Az
	for nfsv4@ietf.org; Mon, 02 May 2005 17:23:09 -0400
Received: from citi.umich.edu (citi.umich.edu [141.211.133.111])
	by citi.umich.edu (Postfix) with ESMTP id A2F111BBAA;
	Mon,  2 May 2005 17:08:49 -0400 (EDT)
X-Mailer: exmh version 2.7.2 01/07/2005 with nmh-1.1
To: nfsv4@ietf.org
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Date: Mon, 02 May 2005 17:08:49 -0400
From: "William A.(Andy) Adamson" <andros@citi.umich.edu>
Message-Id: <20050502210849.A2F111BBAA@citi.umich.edu>
X-Spam-Score: 0.0 (/)
X-Scan-Signature: 7d33c50f3756db14428398e2bdedd581
Cc: Michael.Eisler@netapp.com
Subject: [nfsv4] rfc2025 (SPKM) and Version 3 X.509
X-BeenThere: nfsv4@ietf.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: NFSv4 Working Group <nfsv4.ietf.org>
List-Unsubscribe: <https://www1.ietf.org/mailman/listinfo/nfsv4>,
	<mailto:nfsv4-request@ietf.org?subject=unsubscribe>
List-Archive: <http://www1.ietf.org/pipermail/nfsv4>
List-Post: <mailto:nfsv4@ietf.org>
List-Help: <mailto:nfsv4-request@ietf.org?subject=help>
List-Subscribe: <https://www1.ietf.org/mailman/listinfo/nfsv4>,
	<mailto:nfsv4-request@ietf.org?subject=subscribe>
Sender: nfsv4-bounces@ietf.org
Errors-To: nfsv4-bounces@ietf.org

Hello

Appendix B of rfc2025 states

   This appendix contains, for completeness, the relevant ASN.1 types
   imported from InformationFramework (1993), AuthenticationFramework
   (1993), and [PKCS3].

i.e., use these versions of the InformationFramework and
AuthenticationFramework ASN.1 types. This means that SPKM uses only version 1
or version 2 X.509 certificates, which do not contain any extensions. This is
problematic: everyone uses version 3 X.509 certs (UMICH, the Grid...).

What, if anything, needs to be done officially to allow SPKM to use version 3
certificates?
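
For context, the version lives in an optional [0] EXPLICIT INTEGER at the
start of the TBSCertificate, where 0, 1 and 2 denote v1, v2 and v3, so the
three versions can be told apart before extensions are even looked at. A
minimal sketch of that check in C, assuming a DER-encoded certificate already
in memory; der_header and x509_version are hypothetical helper names, not part
of SPKM or of any library:

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Read one DER tag/length header starting at buf[*pos].
 * On success, store the tag and content length, advance *pos past the
 * header, and return 0. Return -1 on truncated or unsupported input.
 */
int der_header(const uint8_t *buf, size_t len, size_t *pos,
               uint8_t *tag, size_t *clen)
{
    size_t p = *pos;
    uint8_t b;

    if (p >= len || len - p < 2)
        return -1;
    *tag = buf[p++];
    b = buf[p++];
    if (b < 0x80) {                     /* short form length */
        *clen = b;
    } else {                            /* long form: (b & 0x7f) length bytes */
        size_t n = b & 0x7f;
        if (n == 0 || n > sizeof(size_t) || len - p < n)
            return -1;
        *clen = 0;
        while (n--)
            *clen = (*clen << 8) | buf[p++];
    }
    *pos = p;
    return 0;
}

/*
 * Return 1, 2 or 3 for the X.509 version of a DER certificate, or -1 if
 * the leading structure is not what we expect. Only the headers up to
 * the version field are examined; nothing else is validated.
 */
int x509_version(const uint8_t *buf, size_t len)
{
    size_t pos = 0, clen;
    uint8_t tag;

    /* Certificate ::= SEQUENCE { tbsCertificate SEQUENCE { ... */
    if (der_header(buf, len, &pos, &tag, &clen) != 0 || tag != 0x30)
        return -1;
    if (der_header(buf, len, &pos, &tag, &clen) != 0 || tag != 0x30)
        return -1;
    if (pos >= len)
        return -1;

    /* version [0] EXPLICIT INTEGER DEFAULT v1: absent means version 1 */
    if (buf[pos] != 0xa0)
        return 1;
    if (der_header(buf, len, &pos, &tag, &clen) != 0)
        return -1;
    if (der_header(buf, len, &pos, &tag, &clen) != 0 ||
        tag != 0x02 || clen != 1 || pos >= len || buf[pos] > 2)
        return -1;
    return buf[pos] + 1;                /* encoded 0, 1, 2 => v1, v2, v3 */
}
```

A result of 3 from x509_version is the case that the 1993 type definitions
do not cover.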

thanks

-->Andy


_______________________________________________
nfsv4 mailing list
nfsv4@ietf.org
https://www1.ietf.org/mailman/listinfo/nfsv4


From nfsv4-bounces@ietf.org Mon May 02 17:30:51 2005
Received: from localhost.localdomain ([127.0.0.1] helo=megatron.ietf.org)
	by megatron.ietf.org with esmtp (Exim 4.32)
	id 1DSiV9-0007fb-Js; Mon, 02 May 2005 17:30:51 -0400
Received: from odin.ietf.org ([132.151.1.176] helo=ietf.org)
	by megatron.ietf.org with esmtp (Exim 4.32) id 1DSiV7-0007fU-NA
	for nfsv4@megatron.ietf.org; Mon, 02 May 2005 17:30:49 -0400
Received: from ietf-mx.ietf.org (ietf-mx.ietf.org [132.151.6.1])
	by ietf.org (8.9.1a/8.9.1a) with ESMTP id RAA17375
	for <nfsv4@ietf.org>; Mon, 2 May 2005 17:30:45 -0400 (EDT)
Received: from nwkea-mail-2.sun.com ([192.18.42.14])
	by ietf-mx.ietf.org with esmtp (Exim 4.33) id 1DSiip-0002wd-Ef
	for nfsv4@ietf.org; Mon, 02 May 2005 17:45:00 -0400
Received: from centralmail1brm.Central.Sun.COM ([129.147.62.1])
	by nwkea-mail-2.sun.com (8.12.10/8.12.9) with ESMTP id j42LUiQ1014896
	for <nfsv4@ietf.org>; Mon, 2 May 2005 14:30:44 -0700 (PDT)
Received: from binky.Central.Sun.COM (binky.Central.Sun.COM [129.153.128.104])
	by centralmail1brm.Central.Sun.COM (8.12.10+Sun/8.12.10/ENSMAIL,
	v2.2) with ESMTP id j42LUhac001881
	for <nfsv4@ietf.org>; Mon, 2 May 2005 15:30:44 -0600 (MDT)
Received: from binky.Central.Sun.COM (localhost [127.0.0.1])
	by binky.Central.Sun.COM (8.13.3+Sun/8.13.3) with ESMTP id
	j42LUZhg028680; Mon, 2 May 2005 16:30:35 -0500 (CDT)
Received: (from nw141292@localhost)
	by binky.Central.Sun.COM (8.13.3+Sun/8.13.3/Submit) id j42LUU77028679; 
	Mon, 2 May 2005 16:30:30 -0500 (CDT)
Date: Mon, 2 May 2005 16:30:30 -0500
From: Nicolas Williams <Nicolas.Williams@sun.com>
To: "William A.(Andy) Adamson" <andros@citi.umich.edu>
Subject: Re: [nfsv4] rfc2025 (SPKM) and Version 3 X.509
Message-ID: <20050502213030.GO28366@binky.Central.Sun.COM>
Mail-Followup-To: "William A.(Andy) Adamson" <andros@citi.umich.edu>,
	nfsv4@ietf.org, Michael.Eisler@netapp.com
References: <20050502210849.A2F111BBAA@citi.umich.edu>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20050502210849.A2F111BBAA@citi.umich.edu>
User-Agent: Mutt/1.5.7i
X-Spam-Score: 0.0 (/)
X-Scan-Signature: 856eb5f76e7a34990d1d457d8e8e5b7f
Cc: Michael.Eisler@netapp.com, nfsv4@ietf.org
X-BeenThere: nfsv4@ietf.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: NFSv4 Working Group <nfsv4.ietf.org>
List-Unsubscribe: <https://www1.ietf.org/mailman/listinfo/nfsv4>,
	<mailto:nfsv4-request@ietf.org?subject=unsubscribe>
List-Archive: <http://www1.ietf.org/pipermail/nfsv4>
List-Post: <mailto:nfsv4@ietf.org>
List-Help: <mailto:nfsv4-request@ietf.org?subject=help>
List-Subscribe: <https://www1.ietf.org/mailman/listinfo/nfsv4>,
	<mailto:nfsv4-request@ietf.org?subject=subscribe>
Sender: nfsv4-bounces@ietf.org
Errors-To: nfsv4-bounces@ietf.org

On Mon, May 02, 2005 at 05:08:49PM -0400, William A.(Andy) Adamson wrote:
> Hello
> 
> Appendix B of rfc2025 states
> 
>    This appendix contains, for completeness, the relevant ASN.1 types
>    imported from InformationFramework (1993), AuthenticationFramework
>    (1993), and [PKCS3].
> 
> i.e., use these versions of the InformationFramework and
> AuthenticationFramework ASN.1 types. This means that SPKM uses only version 1
> or version 2 X.509 certificates, which do not contain any extensions. This is
> problematic: everyone uses version 3 X.509 certs (UMICH, the Grid...).
>
> What, if anything, needs to be done officially to allow SPKM to use version 3
> certificates?

Deprecate SPKM and replace it with a new mechanism based on DTLS :)



From nfsv4-bounces@ietf.org Tue May 03 14:28:55 2005
Received: from localhost.localdomain ([127.0.0.1] helo=megatron.ietf.org)
	by megatron.ietf.org with esmtp (Exim 4.32)
	id 1DT28d-0005RD-Oe; Tue, 03 May 2005 14:28:55 -0400
Received: from odin.ietf.org ([132.151.1.176] helo=ietf.org)
	by megatron.ietf.org with esmtp (Exim 4.32) id 1DT28c-0005R8-Bt
	for nfsv4@megatron.ietf.org; Tue, 03 May 2005 14:28:54 -0400
Received: from ietf-mx.ietf.org (ietf-mx.ietf.org [132.151.6.1])
	by ietf.org (8.9.1a/8.9.1a) with ESMTP id OAA18623
	for <nfsv4@ietf.org>; Tue, 3 May 2005 14:28:52 -0400 (EDT)
Received: from mx2.netapp.com ([216.240.18.37])
	by ietf-mx.ietf.org with esmtp (Exim 4.33) id 1DT2MX-0000oM-Az
	for nfsv4@ietf.org; Tue, 03 May 2005 14:43:18 -0400
Received: from smtp1.corp.netapp.com (10.57.156.124)
	by mx2.netapp.com with ESMTP; 03 May 2005 11:28:43 -0700
X-IronPort-AV: i="3.92,151,1112598000"; 
	d="scan'208"; a="210156976:sNHT16868040"
Received: from svlexc02.hq.netapp.com (svlexc02.corp.netapp.com
	[10.57.157.136])
	by smtp1.corp.netapp.com (8.13.1/8.13.1/NTAP-1.6) with ESMTP id
	j43ISgX5023596
	for <nfsv4@ietf.org>; Tue, 3 May 2005 11:28:42 -0700 (PDT)
Received: from lavender.hq.netapp.com ([10.56.11.75]) by
	svlexc02.hq.netapp.com with Microsoft SMTPSVC(5.0.2195.6713); 
	Tue, 3 May 2005 11:28:42 -0700
X-MimeOLE: Produced By Microsoft Exchange V6.0.6603.0
content-class: urn:content-classes:message
MIME-Version: 1.0
Content-Type: text/plain;
	charset="us-ascii"
Content-Transfer-Encoding: quoted-printable
Date: Tue, 3 May 2005 11:28:42 -0700
Message-ID: <482A3FA0050D21419C269D13989C611303D87BE1@lavender-fe.eng.netapp.com>
Thread-Topic: Returning NFS4ERR_CB_PATH_DOWN 
Thread-Index: AcVQDe7sYhHIs7RNS4+4pZ2A+SZQSQ==
From: "Khan, Saadia" <Saadia.Khan@netapp.com>
To: <nfsv4@ietf.org>
X-OriginalArrivalTime: 03 May 2005 18:28:42.0587 (UTC)
	FILETIME=[EF153AB0:01C5500D]
X-Spam-Score: 0.0 (/)
X-Scan-Signature: 0a7aa2e6e558383d84476dc338324fab
Content-Transfer-Encoding: quoted-printable
Subject: [nfsv4] Returning NFS4ERR_CB_PATH_DOWN 
X-BeenThere: nfsv4@ietf.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: NFSv4 Working Group <nfsv4.ietf.org>
List-Unsubscribe: <https://www1.ietf.org/mailman/listinfo/nfsv4>,
	<mailto:nfsv4-request@ietf.org?subject=unsubscribe>
List-Archive: <http://www1.ietf.org/pipermail/nfsv4>
List-Post: <mailto:nfsv4@ietf.org>
List-Help: <mailto:nfsv4-request@ietf.org?subject=help>
List-Subscribe: <https://www1.ietf.org/mailman/listinfo/nfsv4>,
	<mailto:nfsv4-request@ietf.org?subject=subscribe>
Sender: nfsv4-bounces@ietf.org
Errors-To: nfsv4-bounces@ietf.org

This is what the spec has to say about NFS4ERR_CB_PATH_DOWN:

   When the client holds delegations, it needs to use RENEW to detect
   when the server has determined that the callback path is down.  When
   the server has made such a determination, only the RENEW operation
   will renew the lease on delegations.  If the server determines the
   callback path is down, it returns NFS4ERR_CB_PATH_DOWN.  Even though
   it returns NFS4ERR_CB_PATH_DOWN, the server MUST renew the lease on
   the record locks and share reservations that the client has
   established on the server.  If for some reason the lock and share
   reservation lease cannot be renewed, then the server MUST return an
   error other than NFS4ERR_CB_PATH_DOWN, even if the callback path is
   also down.
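
A rough sketch of the RENEW handling this paragraph describes. It is only an
illustration: the struct, the enum and the function name are made up here, and
which "other" error to return is left open, as it is in the text.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical reply codes; a real server would use the NFS4ERR_* values. */
enum renew_reply {
    RENEW_REPLY_OK,
    RENEW_REPLY_CB_PATH_DOWN,   /* stands in for NFS4ERR_CB_PATH_DOWN */
    RENEW_REPLY_OTHER_ERROR     /* any error other than CB_PATH_DOWN */
};

/* Hypothetical per-client state kept by the server. */
struct client_state {
    bool    lease_renewable;    /* false if lock/share leases cannot be renewed */
    bool    cb_path_down;       /* server has decided the callback path is down */
    int64_t lock_lease_expiry;  /* expiry of record locks and share reservations */
};

enum renew_reply handle_renew(struct client_state *c, int64_t now,
                              int64_t lease_time)
{
    /* If the lease cannot be renewed, CB_PATH_DOWN must not be returned,
     * even when the callback path is also down. */
    if (!c->lease_renewable)
        return RENEW_REPLY_OTHER_ERROR;

    /* The lock and share reservation lease is renewed in either case. */
    c->lock_lease_expiry = now + lease_time;

    return c->cb_path_down ? RENEW_REPLY_CB_PATH_DOWN : RENEW_REPLY_OK;
}
```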

and the more confusing part is:

This difficulty is solved by the following rules:

   o  When the callback path is down, the server MUST NOT revoke the
      delegation if one of the following occurs:

      -  The client has issued a RENEW operation and the server has
         returned an NFS4ERR_CB_PATH_DOWN error.  The server MUST renew
         the lease for any record locks and share reservations the
         client has that the server has known about (as opposed to those
         locks and share reservations the client has established but not
         yet sent to the server, due to the delegation).  The server
         SHOULD give the client a reasonable time to return its
         delegations to the server before revoking the client's
         delegations.

      -  The client has not issued a RENEW operation for some period of
         time after the server attempted to recall the delegation.  This
         period of time MUST NOT be less than the value of the
         lease_time attribute.

   o  When the client holds a delegation, it can not rely on operations,
      except for RENEW, that take a stateid, to renew delegation leases
      across callback path failures.  The client that wants to keep
      delegations in force across callback path failures must use RENEW
      to do so.


So it seems that the server should not revoke delegations just because the
callback path is down. If the server finds out that the callback path is
down while it is in the middle of recalling a delegation, it probably needs
to wait for a RENEW from that client so that it can reply with
NFS4ERR_CB_PATH_DOWN, and then wait for at least one lease period before
revoking the delegation.

Does this seem like the right approach? Are there other server
implementations which have implemented this approach and do current
clients correctly handle this error case?

Thanks.
Saadia



From nfsv4-bounces@ietf.org Tue May 03 15:13:02 2005
Received: from localhost.localdomain ([127.0.0.1] helo=megatron.ietf.org)
	by megatron.ietf.org with esmtp (Exim 4.32)
	id 1DT2pK-0002Sd-Mc; Tue, 03 May 2005 15:13:02 -0400
Received: from odin.ietf.org ([132.151.1.176] helo=ietf.org)
	by megatron.ietf.org with esmtp (Exim 4.32) id 1DT2pI-0002SY-U8
	for nfsv4@megatron.ietf.org; Tue, 03 May 2005 15:13:00 -0400
Received: from ietf-mx.ietf.org (ietf-mx.ietf.org [132.151.6.1])
	by ietf.org (8.9.1a/8.9.1a) with ESMTP id PAA23066
	for <nfsv4@ietf.org>; Tue, 3 May 2005 15:12:58 -0400 (EDT)
Received: from dsl093-002-214.det1.dsl.speakeasy.net ([66.93.2.214]
	helo=pickle.fieldses.org) by ietf-mx.ietf.org with esmtp (Exim 4.33)
	id 1DT33D-0001ra-Rz
	for nfsv4@ietf.org; Tue, 03 May 2005 15:27:25 -0400
Received: from bfields by pickle.fieldses.org with local (Exim 4.50)
	id 1DT2p2-0006qW-Sz; Tue, 03 May 2005 15:12:44 -0400
Date: Tue, 3 May 2005 15:12:44 -0400
To: "Khan, Saadia" <Saadia.Khan@netapp.com>
Subject: Re: [nfsv4] Returning NFS4ERR_CB_PATH_DOWN
Message-ID: <20050503191244.GA26116@fieldses.org>
References: <482A3FA0050D21419C269D13989C611303D87BE1@lavender-fe.eng.netapp.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <482A3FA0050D21419C269D13989C611303D87BE1@lavender-fe.eng.netapp.com>
User-Agent: Mutt/1.5.9i
From: "J. Bruce Fields" <bfields@fieldses.org>
X-Spam-Score: 0.0 (/)
X-Scan-Signature: 8b30eb7682a596edff707698f4a80f7d
Cc: nfsv4@ietf.org
X-BeenThere: nfsv4@ietf.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: NFSv4 Working Group <nfsv4.ietf.org>
List-Unsubscribe: <https://www1.ietf.org/mailman/listinfo/nfsv4>,
	<mailto:nfsv4-request@ietf.org?subject=unsubscribe>
List-Archive: <http://www1.ietf.org/pipermail/nfsv4>
List-Post: <mailto:nfsv4@ietf.org>
List-Help: <mailto:nfsv4-request@ietf.org?subject=help>
List-Subscribe: <https://www1.ietf.org/mailman/listinfo/nfsv4>,
	<mailto:nfsv4-request@ietf.org?subject=subscribe>
Sender: nfsv4-bounces@ietf.org
Errors-To: nfsv4-bounces@ietf.org

On Tue, May 03, 2005 at 11:28:42AM -0700, Khan, Saadia wrote:
> so it seems like just because the callback path is down, the server
> should not revoke delegations, so in case the server finds out that
> the callback path is down because it is in the middle of recalling the
> delegation, it probably needs to wait for a renew from that client so
> that it can send it NFS4ERR_CB_PATH_DOWN and then wait for at least one
> lease period before revoking the delegation. 

That's how I read it.  With one exception: the "at least one lease
period" requirement is only for the length of time the server has to
wait for the client to send the renew that the server replies
CB_PATH_DOWN to:

>       -  The client has not issued a RENEW operation for some period of
>          time after the server attempted to recall the delegation.  This
>          period of time MUST NOT be less than the value of the
>          lease_time attribute.

Once it's gotten the chance to return CB_PATH_DOWN, the time to wait is
just described as "a reasonable time":

>	The server SHOULD give the client a reasonable time to return
>	its delegations to the server before revoking the client's
>	delegations.
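
To make the two periods concrete, here is a rough sketch of the check a
server might make before revoking once a recall has failed. The struct, the
field names and the reasonable_time parameter are all made up for
illustration; the spec does not put a number on "reasonable".

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bookkeeping for one delegation whose recall has failed. */
struct recall_state {
    int64_t recall_time;        /* when the failed recall was attempted */
    bool    path_down_sent;     /* a RENEW has been answered with CB_PATH_DOWN */
    int64_t path_down_time;     /* when that reply was sent */
};

bool may_revoke(const struct recall_state *r, int64_t now,
                int64_t lease_time, int64_t reasonable_time)
{
    /* The client has been told: give it a "reasonable time" to return
     * its delegations (the length is a server policy assumption). */
    if (r->path_down_sent)
        return now - r->path_down_time >= reasonable_time;

    /* No RENEW seen yet: wait at least lease_time after the recall. */
    return now - r->recall_time >= lease_time;
}
```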

> Are there other server implementations which have implemented this
> approach

Linux returns CB_PATH_DOWN, but in some situations I think it may not
wait the correct amount of time before revoking the delegation--but
that's a bug I need to fix.

> and do current clients correctly handle this error case?

The Linux client has code to handle this case, which looks like it does
exactly the right thing.  I don't know how much it's actually been
tested.

--b.



From nfsv4-bounces@ietf.org Tue May 03 15:38:57 2005
Received: from localhost.localdomain ([127.0.0.1] helo=megatron.ietf.org)
	by megatron.ietf.org with esmtp (Exim 4.32)
	id 1DT3EP-0002BA-4t; Tue, 03 May 2005 15:38:57 -0400
Received: from odin.ietf.org ([132.151.1.176] helo=ietf.org)
	by megatron.ietf.org with esmtp (Exim 4.32) id 1DT3EN-0002B5-Gu
	for nfsv4@megatron.ietf.org; Tue, 03 May 2005 15:38:55 -0400
Received: from ietf-mx.ietf.org (ietf-mx.ietf.org [132.151.6.1])
	by ietf.org (8.9.1a/8.9.1a) with ESMTP id PAA25974
	for <nfsv4@ietf.org>; Tue, 3 May 2005 15:38:53 -0400 (EDT)
From: rick@snowhite.cis.uoguelph.ca
Received: from dargo.cs.uoguelph.ca ([131.104.96.159])
	by ietf-mx.ietf.org with esmtp (Exim 4.33) id 1DT3SJ-0002ai-3m
	for nfsv4@ietf.org; Tue, 03 May 2005 15:53:20 -0400
Received: from snowhite.cis.uoguelph.ca (snowhite.cis.uoguelph.ca
	[131.104.48.1])
	by dargo.cs.uoguelph.ca (8.13.1/8.13.1) with ESMTP id j43Jcq0A021211
	for <nfsv4@ietf.org>; Tue, 3 May 2005 15:38:52 -0400
Received: (from rick@localhost)
	by snowhite.cis.uoguelph.ca (8.9.3/8.9.3) id PAA51413
	for nfsv4@ietf.org; Tue, 3 May 2005 15:39:43 -0400 (EDT)
Date: Tue, 3 May 2005 15:39:43 -0400 (EDT)
Message-Id: <200505031939.PAA51413@snowhite.cis.uoguelph.ca>
To: nfsv4@ietf.org
X-Scanned-By: MIMEDefang 2.44
X-Spam-Score: 0.3 (/)
X-Scan-Signature: 789c141a303c09204b537a4078e2a63f
Subject: [nfsv4] re: delegation recall with CB path down
X-BeenThere: nfsv4@ietf.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: NFSv4 Working Group <nfsv4.ietf.org>
List-Unsubscribe: <https://www1.ietf.org/mailman/listinfo/nfsv4>,
	<mailto:nfsv4-request@ietf.org?subject=unsubscribe>
List-Archive: <http://www1.ietf.org/pipermail/nfsv4>
List-Post: <mailto:nfsv4@ietf.org>
List-Help: <mailto:nfsv4-request@ietf.org?subject=help>
List-Subscribe: <https://www1.ietf.org/mailman/listinfo/nfsv4>,
	<mailto:nfsv4-request@ietf.org?subject=subscribe>
Sender: nfsv4-bounces@ietf.org
Errors-To: nfsv4-bounces@ietf.org

> so it seems like just because the callback path is down, the server
> should not revoke delegations, so in case the server finds out that
> the callback path is down because it is in the middle of recalling the
> delegation, it probably needs to wait for a renew from that client so
> that it can send it NFS4ERR_CB_PATH_DOWN and then wait for at least one
> lease period before revoking the delegation.
> 
> Does this seem like the right approach? Are there other server
> implementations which have implemented this approach and do current
> clients correctly handle this error case?

This is essentially what my server does. After issuing the Recall, it
sets a timeout of "2 * lease + delta" before revocation, if the recall
failed. Since the failed recall will have marked the callback path down,
it will reply NFS4ERR_CB_PATH_DOWN to any Renew during that time.
--> The client should have somewhat more than one lease duration after
    receiving the NFS4ERR_CB_PATH_DOWN to return outstanding delegations.

I've attached my code, in case the comments are of interest to anyone, rick
--- bsd delegation recall code ---
	/*
	 * If the conflict is with an old delegation...
	 */
	if (stp->ls_flags & NFSLCK_OLDDELEG) {
		/*
		 * You can delete it, if it has expired.
		 */
		if (clp->lc_delegtime < NFSD_MONOSEC) {
			s = splsoftclock();
			nfsrv_freedeleg(stp);
			splx(s);
			return (0);
		}
		/*
		 * During this delay, the old delegation could expire or it
		 * could be recovered by the client via an Open with
		 * CLAIM_DELEGATE_PREV.
		 * Release the nfsv4root_lock, if held.
		 */
		if (*haslockp) {
			*haslockp = 0;
			nfsrv_v4rootunlock(1);
		}
		return (NFSERR_DELAY);
	}

	/*
	 * It's a current delegation, so:
	 * - check to see if the delegation has expired
	 *   - if so, get the v4root lock and then expire it
	 */
	if (!(stp->ls_flags & NFSLCK_DELEGRECALL)) {
		if (*haslockp) {
			*haslockp = 0;
			nfsrv_v4rootunlock(1);
		}
		/*
		 * - do a recall callback, since not yet done
		 * For now, never allow truncate to be set. To use
		 * truncate safely, it must be guaranteed that the
		 * Remove, Rename or Setattr with size of 0 will
		 * succeed and that would require major changes to
		 * the VFS/Vnode OPs.
		 * Set the expiry time large enough so that it won't expire
		 * until after the callback, then set it correctly, once
		 * the callback is done. (The delegation will now time
		 * out whether or not the Recall worked ok, but with a
		 * larger timeout if it succeeded.)
		 */
		stp->ls_delegtime = NFSD_MONOSEC + 240;
		stp->ls_flags |= NFSLCK_DELEGRECALL;

		/*
		 * Loop up to NFSV4_CBRETRYCNT times while the CBRecall
		 * replies NFSERR_BADSTATEID or NFSERR_BADHANDLE. This is
		 * done in order to try and avoid a race that could happen
		 * when a CBRecall request passed the Open reply with
		 * the delegation in it when transiting the network.
		 */
		retrycnt = 0;
		do {
		    error = nfsrv_docallback(clp, NFSV4OP_CBRECALL,
			&stp->ls_stateid, 0, &stp->ls_lfp->lf_fh, NULL, NULL, p);
		    retrycnt++;
		} while ((error == NFSERR_BADSTATEID ||
		    error == NFSERR_BADHANDLE) && retrycnt < NFSV4_CBRETRYCNT);
		if (error)
			stp->ls_delegtime = NFSD_MONOSEC +
				(2 * nfsrv_lease) + NFSRV_LEASEDELTA;
		else
			stp->ls_delegtime = NFSD_MONOSEC +
				(5 * nfsrv_lease) + NFSRV_LEASEDELTA;
		return (NFSERR_DELAY);
	}

	if (clp->lc_expiry >= NFSD_MONOSEC &&
	    stp->ls_delegtime >= NFSD_MONOSEC) {
		/*
		 * A recall has been done, but it has not yet expired.
		 * So, RETURN_DELAY.
		 */
		if (*haslockp) {
			*haslockp = 0;
			nfsrv_v4rootunlock(1);
		}
		return (NFSERR_DELAY);
	}

	/*
	 * If we don't yet have the lock, just get it and then return,
	 * since we need that before deleting expired state, such as
	 * this delegation.
	 * When getting the lock, unlock the vnode, so other nfsds that
	 * are in progress, won't get stuck waiting for the vnode lock.
	 */
	if (*haslockp == 0) {
		nfsrv_v4rootrelref();
		VOP_UNLOCK(vp, 0, p);
		do {
			gotlock = nfsrv_v4rootlock(1);
		} while (!gotlock);
		*haslockp = 1;
		vn_lock(vp, LK_EXCLUSIVE | LK_RETRY, p);
		return (-1);
	}

	/*
	 * Ok, we can delete the expired delegation.
	 * First, write the Revoke record to stable storage and then
	 * clear out the conflict.
	 */
	nfsrv_writestable(clp->lc_id, clp->lc_idlen, NFSNST_REVOKE, p);
	s = splsoftclock();
	if (clp->lc_expiry < NFSD_MONOSEC) {
		nfsrv_cleanclient(clp);
		nfsrv_freedeleglist(&clp->lc_deleg);
		nfsrv_freedeleglist(&clp->lc_olddeleg);
		LIST_REMOVE(clp, lc_hash);
		zapped_clp = 1;
	} else {
		nfsrv_freedeleg(stp);
		zapped_clp = 0;
	}
	splx(s);
	if (zapped_clp)
		nfsrv_zapclient(clp, p);
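
Stripped of the kernel details, the retry and timeout choices in the code
above come down to something like the following. It is a standalone sketch,
not the server code: the function names and the ERR_* values are invented
here, and the lease and delta values are whatever the caller passes in.

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative error values only; they are not the NFSERR_* constants. */
enum { ERR_NONE = 0, ERR_BADSTATEID = 1, ERR_BADHANDLE = 2, ERR_OTHER = 3 };

/*
 * Retry the recall only for the two errors that suggest the CBRecall
 * overtook the Open reply, and only up to max_attempts tries.
 */
bool retry_recall(int error, int attempts, int max_attempts)
{
    return (error == ERR_BADSTATEID || error == ERR_BADHANDLE) &&
        attempts < max_attempts;
}

/*
 * Pick the time after which the delegation may be revoked:
 * 2 * lease + delta if the recall failed, 5 * lease + delta otherwise.
 */
int64_t revoke_deadline(int error, int64_t now, int64_t lease, int64_t delta)
{
    int64_t multiple = error ? 2 : 5;

    return now + multiple * lease + delta;
}
```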



From nfsv4-bounces@ietf.org Tue May 03 18:58:18 2005
Received: from localhost.localdomain ([127.0.0.1] helo=megatron.ietf.org)
	by megatron.ietf.org with esmtp (Exim 4.32)
	id 1DT6LK-0001Gh-1N; Tue, 03 May 2005 18:58:18 -0400
Received: from odin.ietf.org ([132.151.1.176] helo=ietf.org)
	by megatron.ietf.org with esmtp (Exim 4.32) id 1DT6LI-0001EX-Ag
	for nfsv4@megatron.ietf.org; Tue, 03 May 2005 18:58:16 -0400
Received: from ietf-mx.ietf.org (ietf-mx.ietf.org [132.151.6.1])
	by ietf.org (8.9.1a/8.9.1a) with ESMTP id SAA20998
	for <nfsv4@ietf.org>; Tue, 3 May 2005 18:58:11 -0400 (EDT)
Received: from brmea-mail-3.sun.com ([192.18.98.34])
	by ietf-mx.ietf.org with esmtp (Exim 4.33) id 1DT6ZB-0002Ed-Ah
	for nfsv4@ietf.org; Tue, 03 May 2005 19:12:40 -0400
Received: from sfbaymail1sca.SFBay.Sun.COM ([129.145.154.35])
	by brmea-mail-3.sun.com (8.12.10/8.12.9) with ESMTP id j43Mw9jO028066; 
	Tue, 3 May 2005 16:58:09 -0600 (MDT)
Received: from sheplap.Central.Sun.COM (sheplap.Central.Sun.COM [10.1.194.251])
	by sfbaymail1sca.SFBay.Sun.COM (8.12.10+Sun/8.12.10/ENSMAIL,v2.2) with
	ESMTP id j43Mw8jg015643; Tue, 3 May 2005 15:58:09 -0700 (PDT)
Received: by sheplap.Central.Sun.COM (Postfix, from userid 76367)
	id AD5FA3A7920; Tue,  3 May 2005 17:58:08 -0500 (CDT)
Date: Tue, 3 May 2005 17:58:08 -0500
From: Spencer Shepler <spencer.shepler@sun.com>
To: "Khan, Saadia" <Saadia.Khan@netapp.com>
Subject: Re: [nfsv4] Returning NFS4ERR_CB_PATH_DOWN
Message-ID: <20050503225808.GA1209@sheplap.Central.Sun.COM>
References: <482A3FA0050D21419C269D13989C611303D87BE1@lavender-fe.eng.netapp.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <482A3FA0050D21419C269D13989C611303D87BE1@lavender-fe.eng.netapp.com>
User-Agent: Mutt/1.4.1i
X-Spam-Score: 0.0 (/)
X-Scan-Signature: 50a516d93fd399dc60588708fd9a3002
Cc: nfsv4@ietf.org
X-BeenThere: nfsv4@ietf.org
X-Mailman-Version: 2.1.5
Precedence: list
Reply-To: spencer.shepler@sun.com
List-Id: NFSv4 Working Group <nfsv4.ietf.org>
List-Unsubscribe: <https://www1.ietf.org/mailman/listinfo/nfsv4>,
	<mailto:nfsv4-request@ietf.org?subject=unsubscribe>
List-Archive: <http://www1.ietf.org/pipermail/nfsv4>
List-Post: <mailto:nfsv4@ietf.org>
List-Help: <mailto:nfsv4-request@ietf.org?subject=help>
List-Subscribe: <https://www1.ietf.org/mailman/listinfo/nfsv4>,
	<mailto:nfsv4-request@ietf.org?subject=subscribe>
Sender: nfsv4-bounces@ietf.org
Errors-To: nfsv4-bounces@ietf.org

On Tue, Khan, Saadia wrote:

<...>

> 
> 
> so it seems like just because the callback path is down, the server
> should not revoke delegations, so in case the server finds out that
> the callback path is down because it is in the middle of recalling the
> delegation, it probably needs to wait for a renew from that client so
> that it can send it NFS4ERR_CB_PATH_DOWN and then wait for at least one
> lease period before revoking the delegation.
> 
> Does this seem like the right approach? Are there other server
> implementations which have implemented this approach and do current
> clients correctly handle this error case?

Yes, this is appropriate.

The Solaris client, upon receipt of NFS4ERR_CB_PATH_DOWN, will return
all of its delegations.

The Solaris server will do the CB_PATH_DOWN notification but there may
be one bug lurking in there someplace.


I will take the opportunity to close a thread of discussion we had
a while back that is similar to this.

It concerned the race between client and server when the client is
granted a delegation on OPEN and, while the OPEN response is still on
its way back to the client, the server needs to recall the delegation.
The CB_RECALL may reach the client first, in which case the client will
likely return an error to the CB_RECALL.  We discussed the various ways
to fix this and, I believe, came to consensus on the following.

If the server receives an error, like NFS4ERR_BADHANDLE, in response
to CB_RECALL, it should not immediately revoke the delegation but
retry the CB_RECALL within a lease period.  Once a lease period has
passed from the initial CB_RECALL, the server will then revoke the
delegation.
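
As a sketch of that rule (not the Solaris code; the enum and function names
are made up, and every CB_RECALL error is treated alike here, which is an
assumption):

```c
#include <stdint.h>

/* Hypothetical outcomes of evaluating one CB_RECALL reply. */
enum recall_action {
    RECALL_DELIVERED,   /* client accepted the recall; wait for the return */
    RECALL_RETRY,       /* error, but still within a lease period: resend */
    RECALL_REVOKE       /* error, and a lease period has passed: revoke */
};

enum recall_action after_cb_recall(int error, int64_t first_recall_time,
                                   int64_t now, int64_t lease_time)
{
    if (error == 0)
        return RECALL_DELIVERED;

    /* The lease period is measured from the initial CB_RECALL. */
    if (now - first_recall_time < lease_time)
        return RECALL_RETRY;

    return RECALL_REVOKE;
}
```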

The Solaris server doesn't do this currently but I will be making
a change soon to correct this.

Spencer



From nfsv4-bounces@ietf.org Tue May 03 18:59:41 2005
Received: from localhost.localdomain ([127.0.0.1] helo=megatron.ietf.org)
	by megatron.ietf.org with esmtp (Exim 4.32)
	id 1DT6Mf-0001eR-AU; Tue, 03 May 2005 18:59:41 -0400
Received: from odin.ietf.org ([132.151.1.176] helo=ietf.org)
	by megatron.ietf.org with esmtp (Exim 4.32) id 1DT6Md-0001eG-Qo
	for nfsv4@megatron.ietf.org; Tue, 03 May 2005 18:59:40 -0400
Received: from ietf-mx.ietf.org (ietf-mx.ietf.org [132.151.6.1])
	by ietf.org (8.9.1a/8.9.1a) with ESMTP id SAA21145
	for <nfsv4@ietf.org>; Tue, 3 May 2005 18:59:36 -0400 (EDT)
Received: from brmea-mail-4.sun.com ([192.18.98.36])
	by ietf-mx.ietf.org with esmtp (Exim 4.33) id 1DT6ab-0002Hh-6A
	for nfsv4@ietf.org; Tue, 03 May 2005 19:14:06 -0400
Received: from sfbaymail2sca.sfbay.sun.com ([129.145.155.42])
	by brmea-mail-4.sun.com (8.12.10/8.12.9) with ESMTP id j43MxZi7018084; 
	Tue, 3 May 2005 16:59:35 -0600 (MDT)
Received: from sheplap.Central.Sun.COM (sheplap.Central.Sun.COM [10.1.194.251])
	by sfbaymail2sca.sfbay.sun.com (8.12.10+Sun/8.12.10/ENSMAIL,v2.2) with
	ESMTP id j43MxZAH016885; Tue, 3 May 2005 15:59:35 -0700 (PDT)
Received: by sheplap.Central.Sun.COM (Postfix, from userid 76367)
	id E4CF33A7932; Tue,  3 May 2005 17:59:34 -0500 (CDT)
Date: Tue, 3 May 2005 17:59:34 -0500
From: Spencer Shepler <spencer.shepler@sun.com>
To: "J. Bruce Fields" <bfields@fieldses.org>
Subject: Re: [nfsv4] Returning NFS4ERR_CB_PATH_DOWN
Message-ID: <20050503225934.GB1209@sheplap.Central.Sun.COM>
References: <482A3FA0050D21419C269D13989C611303D87BE1@lavender-fe.eng.netapp.com>
	<20050503191244.GA26116@fieldses.org>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20050503191244.GA26116@fieldses.org>
User-Agent: Mutt/1.4.1i
X-Spam-Score: 0.0 (/)
X-Scan-Signature: 2409bba43e9c8d580670fda8b695204a
Cc: "Khan, Saadia" <Saadia.Khan@netapp.com>, nfsv4@ietf.org
X-BeenThere: nfsv4@ietf.org
X-Mailman-Version: 2.1.5
Precedence: list
Reply-To: spencer.shepler@sun.com
List-Id: NFSv4 Working Group <nfsv4.ietf.org>
List-Unsubscribe: <https://www1.ietf.org/mailman/listinfo/nfsv4>,
	<mailto:nfsv4-request@ietf.org?subject=unsubscribe>
List-Archive: <http://www1.ietf.org/pipermail/nfsv4>
List-Post: <mailto:nfsv4@ietf.org>
List-Help: <mailto:nfsv4-request@ietf.org?subject=help>
List-Subscribe: <https://www1.ietf.org/mailman/listinfo/nfsv4>,
	<mailto:nfsv4-request@ietf.org?subject=subscribe>
Sender: nfsv4-bounces@ietf.org
Errors-To: nfsv4-bounces@ietf.org

On Tue, J. Bruce Fields wrote:
> On Tue, May 03, 2005 at 11:28:42AM -0700, Khan, Saadia wrote:
> > so it seems like just because the callback path is down, the server
> > should not revoke delegations, so in case the server finds out that
> > the callback path is down because it is in the middle of recalling the
> > delegation, it probably needs to wait for a renew from that client so
> > that it can send it NFS4ERR_CB_PATH_DOWN and then wait for at least one
> > lease period before revoking the delegation. 
> 
> That's how I read it.  With one exception: the "at least one lease
> period" requirement is only for the length of time the server has to
> wait for the client to send the renew that the server replies
> CB_PATH_DOWN to:
> 
> >	The server SHOULD give the client a reasonable time to return
> >	its delegations to the server before revoking the client's
> >	delegations.
> 
> Once it's gotten the chance to return CB_PATH_DOWN, the time to wait is
> just described as "a reasonable period":

I agree.  The server can extend the period if it observes the client
making a reasonable effort to return delegations (just as it would if
a CB_RECALL has been serviced and there are WRITEs being done for the
file).
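
One way to picture that, as a sketch only: each time the server sees the
client making progress, it pushes the revocation deadline out a little. The
function name, the extension length and the hard_limit cap are all
assumptions added for illustration; nothing above says a cap is required.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Return the new revocation deadline. The deadline is never shortened,
 * and it is only pushed out when progress was seen.
 */
int64_t extend_deadline(int64_t deadline, int64_t now, bool progress_seen,
                        int64_t extension, int64_t hard_limit)
{
    int64_t candidate;

    if (!progress_seen)
        return deadline;

    candidate = now + extension;
    if (candidate > hard_limit)         /* hypothetical upper bound */
        candidate = hard_limit;
    if (candidate < deadline)           /* never move the deadline earlier */
        candidate = deadline;
    return candidate;
}
```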

Spencer



From nfsv4-bounces@ietf.org Fri May 06 15:44:30 2005
Received: from localhost.localdomain ([127.0.0.1] helo=megatron.ietf.org)
	by megatron.ietf.org with esmtp (Exim 4.32)
	id 1DU8kQ-0003PF-EL; Fri, 06 May 2005 15:44:30 -0400
Received: from odin.ietf.org ([132.151.1.176] helo=ietf.org)
	by megatron.ietf.org with esmtp (Exim 4.32)
	id 1DU8kM-0003OM-F4; Fri, 06 May 2005 15:44:26 -0400
Received: from CNRI.Reston.VA.US (localhost [127.0.0.1])
	by ietf.org (8.9.1a/8.9.1a) with ESMTP id PAA29069;
	Fri, 6 May 2005 15:44:24 -0400 (EDT)
Message-Id: <200505061944.PAA29069@ietf.org>
Mime-Version: 1.0
Content-Type: Multipart/Mixed; Boundary="NextPart"
To: i-d-announce@ietf.org
From: Internet-Drafts@ietf.org
Date: Fri, 06 May 2005 15:44:24 -0400
Cc: nfsv4@ietf.org
Subject: [nfsv4] I-D ACTION:draft-ietf-nfsv4-rfc1832bis-06.txt
X-BeenThere: nfsv4@ietf.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: NFSv4 Working Group <nfsv4.ietf.org>
List-Unsubscribe: <https://www1.ietf.org/mailman/listinfo/nfsv4>,
	<mailto:nfsv4-request@ietf.org?subject=unsubscribe>
List-Archive: <http://www1.ietf.org/pipermail/nfsv4>
List-Post: <mailto:nfsv4@ietf.org>
List-Help: <mailto:nfsv4-request@ietf.org?subject=help>
List-Subscribe: <https://www1.ietf.org/mailman/listinfo/nfsv4>,
	<mailto:nfsv4-request@ietf.org?subject=subscribe>
Sender: nfsv4-bounces@ietf.org
Errors-To: nfsv4-bounces@ietf.org

--NextPart

A New Internet-Draft is available from the on-line Internet-Drafts directories.
This draft is a work item of the Network File System Version 4 Working Group of the IETF.

	Title		: XDR: External Data Representation Standard
	Author(s)	: M. Eisler
	Filename	: draft-ietf-nfsv4-rfc1832bis-06.txt
	Pages		: 26
	Date		: 2005-5-6
	
This document describes the External Data Representation Standard
   (XDR) protocol as it is currently deployed and accepted.

A URL for this Internet-Draft is:
http://www.ietf.org/internet-drafts/draft-ietf-nfsv4-rfc1832bis-06.txt

To remove yourself from the I-D Announcement list, send a message to 
i-d-announce-request@ietf.org with the word unsubscribe in the body of the message.  
You can also visit https://www1.ietf.org/mailman/listinfo/I-D-announce 
to change your subscription settings.


Internet-Drafts are also available by anonymous FTP. Login with the username
"anonymous" and a password of your e-mail address. After logging in,
type "cd internet-drafts" and then
	"get draft-ietf-nfsv4-rfc1832bis-06.txt".

A list of Internet-Drafts directories can be found in
http://www.ietf.org/shadow.html 
or ftp://ftp.ietf.org/ietf/1shadow-sites.txt


Internet-Drafts can also be obtained by e-mail.

Send a message to:
	mailserv@ietf.org.
In the body type:
	"FILE /internet-drafts/draft-ietf-nfsv4-rfc1832bis-06.txt".
	
NOTE:	The mail server at ietf.org can return the document in
	MIME-encoded form by using the "mpack" utility.  To use this
	feature, insert the command "ENCODING mime" before the "FILE"
	command.  To decode the response(s), you will need "munpack" or
	a MIME-compliant mail reader.  Different MIME-compliant mail readers
	exhibit different behavior, especially when dealing with
	"multipart" MIME messages (i.e. documents which have been split
	up into multiple messages), so check your local documentation on
	how to manipulate these messages.
		
		
Below is the data which will enable a MIME compliant mail reader
implementation to automatically retrieve the ASCII version of the
Internet-Draft.

--NextPart
Content-Type: Multipart/Alternative; Boundary="OtherAccess"

--OtherAccess
Content-Type: Message/External-body; access-type="mail-server";
	server="mailserv@ietf.org"

Content-Type: text/plain
Content-ID: <2005-5-6153243.I-D@ietf.org>

ENCODING mime
FILE /internet-drafts/draft-ietf-nfsv4-rfc1832bis-06.txt

--OtherAccess
Content-Type: Message/External-body; name="draft-ietf-nfsv4-rfc1832bis-06.txt";
	site="ftp.ietf.org"; access-type="anon-ftp";
	directory="internet-drafts"

Content-Type: text/plain
Content-ID: <2005-5-6153243.I-D@ietf.org>


--OtherAccess--

--NextPart
Content-Type: text/plain; charset="us-ascii"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
Content-Disposition: inline

_______________________________________________
nfsv4 mailing list
nfsv4@ietf.org
https://www1.ietf.org/mailman/listinfo/nfsv4

--NextPart--


From nfsv4-bounces@ietf.org Fri May 06 16:14:59 2005
Received: from localhost.localdomain ([127.0.0.1] helo=megatron.ietf.org)
	by megatron.ietf.org with esmtp (Exim 4.32)
	id 1DU9Dv-0008Jt-AI; Fri, 06 May 2005 16:14:59 -0400
Received: from odin.ietf.org ([132.151.1.176] helo=ietf.org)
	by megatron.ietf.org with esmtp (Exim 4.32) id 1DU9Dt-0008Jo-Pg
	for nfsv4@megatron.ietf.org; Fri, 06 May 2005 16:14:57 -0400
Received: from ietf-mx.ietf.org (ietf-mx.ietf.org [132.151.6.1])
	by ietf.org (8.9.1a/8.9.1a) with ESMTP id QAA07127
	for <nfsv4@ietf.org>; Fri, 6 May 2005 16:14:55 -0400 (EDT)
Received: from eagle.sharedhosting.net ([206.127.192.10])
	by ietf-mx.ietf.org with esmtp (Exim 4.33) id 1DU9SR-0008WR-RA
	for nfsv4@ietf.org; Fri, 06 May 2005 16:30:01 -0400
Received: from eisler.com (localhost [127.0.0.1])
	by eagle.sharedhosting.net (8.12.10/8.12.10) with SMTP id
	j46KElvt002729
	for nfsv4@ietf.org; Fri, 6 May 2005 13:14:47 -0700 (PDT)
Date: Fri, 6 May 2005 13:14:47 -0700 (PDT)
Message-Id: <200505062014.j46KElvt002729@eagle.sharedhosting.net>
To: nfsv4@ietf.org
Subject: RE: [nfsv4] I-D ACTION:draft-ietf-nfsv4-rfc1832bis-06.txt
From: Mike Eisler <mike@eisler.com>
X-Spam-Score: 0.0 (/)
X-Scan-Signature: 0bc60ec82efc80c84b8d02f4b0e4de22
X-BeenThere: nfsv4@ietf.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: NFSv4 Working Group <nfsv4.ietf.org>
List-Unsubscribe: <https://www1.ietf.org/mailman/listinfo/nfsv4>,
	<mailto:nfsv4-request@ietf.org?subject=unsubscribe>
List-Archive: <http://www1.ietf.org/pipermail/nfsv4>
List-Post: <mailto:nfsv4@ietf.org>
List-Help: <mailto:nfsv4-request@ietf.org?subject=help>
List-Subscribe: <https://www1.ietf.org/mailman/listinfo/nfsv4>,
	<mailto:nfsv4-request@ietf.org?subject=subscribe>
Sender: nfsv4-bounces@ietf.org
Errors-To: nfsv4-bounces@ietf.org

> From: Internet-Drafts@ietf.org [mailto:Internet-Drafts@ietf.org] 
[...]
> 
> A New Internet-Draft is available from the on-line 
> Internet-Drafts directories.
> This draft is a work item of the Network File System Version 
> 4 Working Group of the IETF.
> 
> 	Title		: XDR: External Data Representation Standard
> 	Author(s)	: M. Eisler
> 	Filename	: draft-ietf-nfsv4-rfc1832bis-06.txt
> 	Pages		: 26
> 	Date		: 2005-5-6
> 	
> This document describes the External Data Representation Standard
>    (XDR) protocol as it is currently deployed and accepted.
> 
> A URL for this Internet-Draft is:
> http://www.ietf.org/internet-drafts/draft-ietf-nfsv4-rfc1832bis-06.txt

That was fast. This revision is the result of Area
Director review. The feedback was to put real content
into the Security Considerations section.

If you want to see a version with change bars 
relative to -05, see:

  http://www.eisler.com/nfsv4-wg/draft-ietf-nfsv4-rfc1832bis-06.cb



From nfsv4-bounces@ietf.org Thu May 12 22:14:32 2005
Received: from localhost.localdomain ([127.0.0.1] helo=megatron.ietf.org)
	by megatron.ietf.org with esmtp (Exim 4.32)
	id 1DWPhA-00081F-9b; Thu, 12 May 2005 22:14:32 -0400
Received: from odin.ietf.org ([132.151.1.176] helo=ietf.org)
	by megatron.ietf.org with esmtp (Exim 4.32) id 1DWPh8-00081A-8M
	for nfsv4@megatron.ietf.org; Thu, 12 May 2005 22:14:30 -0400
Received: from ietf-mx.ietf.org (ietf-mx.ietf.org [132.151.6.1])
	by ietf.org (8.9.1a/8.9.1a) with ESMTP id WAA05047
	for <nfsv4@ietf.org>; Thu, 12 May 2005 22:14:26 -0400 (EDT)
Received: from nwkea-mail-1.sun.com ([192.18.42.13])
	by ietf-mx.ietf.org with esmtp (Exim 4.33) id 1DWPwo-0007I0-Jk
	for nfsv4@ietf.org; Thu, 12 May 2005 22:30:50 -0400
Received: from sfbaymail1sca.SFBay.Sun.COM ([129.145.154.35])
	by nwkea-mail-1.sun.com (8.12.10/8.12.9) with ESMTP id j4D2EIjG028197
	for <nfsv4@ietf.org>; Thu, 12 May 2005 19:14:18 -0700 (PDT)
Received: from sheplap.Central.Sun.COM (sheplap.Central.Sun.COM [10.1.194.251])
	by sfbaymail1sca.SFBay.Sun.COM (8.12.10+Sun/8.12.10/ENSMAIL,
	v2.2) with ESMTP id j4D2EIjg015853
	for <nfsv4@ietf.org>; Thu, 12 May 2005 19:14:18 -0700 (PDT)
Received: by sheplap.Central.Sun.COM (Postfix, from userid 76367)
	id 8872E3B780A; Thu, 12 May 2005 21:14:25 -0500 (CDT)
Date: Thu, 12 May 2005 21:14:25 -0500
From: Spencer Shepler <spencer.shepler@sun.com>
To: nfsv4@ietf.org
Message-ID: <20050513021424.GP6189@sheplap.Central.Sun.COM>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
User-Agent: Mutt/1.4.1i
X-Spam-Score: 0.3 (/)
X-Scan-Signature: 856eb5f76e7a34990d1d457d8e8e5b7f
Subject: [nfsv4] errata: sequence id handling with NFS4ERR_OLD_STATEID
X-BeenThere: nfsv4@ietf.org
X-Mailman-Version: 2.1.5
Precedence: list
Reply-To: spencer.shepler@sun.com
List-Id: NFSv4 Working Group <nfsv4.ietf.org>
List-Unsubscribe: <https://www1.ietf.org/mailman/listinfo/nfsv4>,
	<mailto:nfsv4-request@ietf.org?subject=unsubscribe>
List-Archive: <http://www1.ietf.org/pipermail/nfsv4>
List-Post: <mailto:nfsv4@ietf.org>
List-Help: <mailto:nfsv4-request@ietf.org?subject=help>
List-Subscribe: <https://www1.ietf.org/mailman/listinfo/nfsv4>,
	<mailto:nfsv4-request@ietf.org?subject=subscribe>
Sender: nfsv4-bounces@ietf.org
Errors-To: nfsv4-bounces@ietf.org


In section 8.1.5, we have the following:

   The client MUST monotonically increment the sequence number for the
   CLOSE, LOCK, LOCKU, OPEN, OPEN_CONFIRM, and OPEN_DOWNGRADE
   operations.  This is true even in the event that the previous
   operation that used the sequence number received an error.  The only
   exception to this rule is if the previous operation received one of
   the following errors: NFS4ERR_STALE_CLIENTID, NFS4ERR_STALE_STATEID,
   NFS4ERR_BAD_STATEID, NFS4ERR_BAD_SEQID, NFS4ERR_BADXDR,
   NFS4ERR_RESOURCE, NFS4ERR_NOFILEHANDLE.

It appears that NFS4ERR_OLD_STATEID is missing from the list of errors
that do not increment the sequence id.

Any objections to that interpretation?
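For concreteness, the client-side rule quoted above, with the proposed addition, might be sketched as follows. This is only an illustration: the names `NO_INCREMENT_ERRORS` and `next_seqid` are hypothetical, statuses are plain strings rather than real protocol constants, and wrapping at 32 bits is an assumption of the sketch (seqid is a 32-bit unsigned value).

```python
# Errors after which the client does NOT advance the seqid (section 8.1.5),
# plus the one this message proposes adding via the errata.
NO_INCREMENT_ERRORS = frozenset({
    "NFS4ERR_STALE_CLIENTID",
    "NFS4ERR_STALE_STATEID",
    "NFS4ERR_BAD_STATEID",
    "NFS4ERR_BAD_SEQID",
    "NFS4ERR_BADXDR",
    "NFS4ERR_RESOURCE",
    "NFS4ERR_NOFILEHANDLE",
    "NFS4ERR_OLD_STATEID",  # proposed erratum, not in the published list
})

SEQID_MODULUS = 2 ** 32  # assumption: wrap like an unsigned 32-bit counter


def next_seqid(current, status):
    """Return the seqid the client should use for its next operation.

    `status` is the result of the previous seqid-bearing operation
    (CLOSE, LOCK, LOCKU, OPEN, OPEN_CONFIRM or OPEN_DOWNGRADE).
    """
    if status in NO_INCREMENT_ERRORS:
        return current  # the server did not consume this seqid
    # Success and every other error both advance the sequence number.
    return (current + 1) % SEQID_MODULUS
```

With this reading, an operation that fails with NFS4ERR_OLD_STATEID is retried with the same seqid, while any error outside the set still advances it.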

Spencer



From nfsv4-bounces@ietf.org Tue May 17 15:02:52 2005
Received: from localhost.localdomain ([127.0.0.1] helo=megatron.ietf.org)
	by megatron.ietf.org with esmtp (Exim 4.32)
	id 1DY7L9-00055q-UU; Tue, 17 May 2005 15:02:51 -0400
Received: from odin.ietf.org ([132.151.1.176] helo=ietf.org)
	by megatron.ietf.org with esmtp (Exim 4.32) id 1DY7L9-00052g-Cf
	for nfsv4@megatron.ietf.org; Tue, 17 May 2005 15:02:51 -0400
Received: from ietf-mx.ietf.org (ietf-mx.ietf.org [132.151.6.1])
	by ietf.org (8.9.1a/8.9.1a) with ESMTP id PAA26462
	for <nfsv4@ietf.org>; Tue, 17 May 2005 15:02:39 -0400 (EDT)
From: rick@snowhite.cis.uoguelph.ca
Received: from mailhub.cs.uoguelph.ca ([131.104.96.75])
	by ietf-mx.ietf.org with esmtp (Exim 4.33) id 1DY7bm-0000V6-5A
	for nfsv4@ietf.org; Tue, 17 May 2005 15:20:03 -0400
Received: from snowhite.cis.uoguelph.ca (snowhite.cis.uoguelph.ca
	[131.104.48.1])
	by mailhub.cs.uoguelph.ca (8.13.1/8.13.1) with ESMTP id j4HJ2cnM027551
	for <nfsv4@ietf.org>; Tue, 17 May 2005 15:02:38 -0400
Received: (from rick@localhost)
	by snowhite.cis.uoguelph.ca (8.9.3/8.9.3) id PAA02874
	for nfsv4@ietf.org; Tue, 17 May 2005 15:03:15 -0400 (EDT)
Date: Tue, 17 May 2005 15:03:15 -0400 (EDT)
Message-Id: <200505171903.PAA02874@snowhite.cis.uoguelph.ca>
To: nfsv4@ietf.org
X-Scanned-By: MIMEDefang 2.44
X-Spam-Score: 0.3 (/)
X-Scan-Signature: ffa9dfbbe7cc58b3fa6b8ae3e57b0aa3
Subject: [nfsv4] re: NFS4ERR_OLD_STATEID and seqid
X-BeenThere: nfsv4@ietf.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: NFSv4 Working Group <nfsv4.ietf.org>
List-Unsubscribe: <https://www1.ietf.org/mailman/listinfo/nfsv4>,
	<mailto:nfsv4-request@ietf.org?subject=unsubscribe>
List-Archive: <http://www1.ietf.org/pipermail/nfsv4>
List-Post: <mailto:nfsv4@ietf.org>
List-Help: <mailto:nfsv4-request@ietf.org?subject=help>
List-Subscribe: <https://www1.ietf.org/mailman/listinfo/nfsv4>,
	<mailto:nfsv4-request@ietf.org?subject=subscribe>
Sender: nfsv4-bounces@ietf.org
Errors-To: nfsv4-bounces@ietf.org

> In section 8.1.5, we have the following:
> 
>    The client MUST monotonically increment the sequence number for the
>    CLOSE, LOCK, LOCKU, OPEN, OPEN_CONFIRM, and OPEN_DOWNGRADE
>    operations.  This is true even in the event that the previous
>    operation that used the sequence number received an error.  The only
>    exception to this rule is if the previous operation received one of
>    the following errors: NFS4ERR_STALE_CLIENTID, NFS4ERR_STALE_STATEID,
>    NFS4ERR_BAD_STATEID, NFS4ERR_BAD_SEQID, NFS4ERR_BADXDR,
>    NFS4ERR_RESOURCE, NFS4ERR_NOFILEHANDLE.
> 
> It appears that NFS4ERR_OLD_STATEID is missing from the list of errors
> that do not increment the sequence id.
> 
> Any objections to that interpretation?

My server currently does increment the seqid before returning
NFS4ERR_OLD_STATEID. I can easily change that (it actually allows me
to delete some code for that special case), but first I'd like to hear
that everyone agrees and that NFS4ERR_OLD_STATEID is being added to the
list as an errata change.

rick
ps: There are several other quirky cases I ran into, along with the above.
    You might find ftp://ftp.cis.uoguelph.ca/pub/nfsv4/Quirky.cases of
    interest.



From nfsv4-bounces@ietf.org Wed May 18 12:31:15 2005
Received: from localhost.localdomain ([127.0.0.1] helo=megatron.ietf.org)
	by megatron.ietf.org with esmtp (Exim 4.32)
	id 1DYRRz-0002xb-LQ; Wed, 18 May 2005 12:31:15 -0400
Received: from odin.ietf.org ([132.151.1.176] helo=ietf.org)
	by megatron.ietf.org with esmtp (Exim 4.32) id 1DYRRy-0002xW-EG
	for nfsv4@megatron.ietf.org; Wed, 18 May 2005 12:31:14 -0400
Received: from ietf-mx.ietf.org (ietf-mx.ietf.org [132.151.6.1])
	by ietf.org (8.9.1a/8.9.1a) with ESMTP id MAA09544
	for <nfsv4@ietf.org>; Wed, 18 May 2005 12:31:09 -0400 (EDT)
Received: from brmea-mail-4.sun.com ([192.18.98.36])
	by ietf-mx.ietf.org with esmtp (Exim 4.33) id 1DYRiu-00009L-FH
	for nfsv4@ietf.org; Wed, 18 May 2005 12:48:45 -0400
Received: from centralmail1brm.Central.Sun.COM ([129.147.62.1])
	by brmea-mail-4.sun.com (8.12.10/8.12.9) with ESMTP id j4IGV60R020571; 
	Wed, 18 May 2005 10:31:06 -0600 (MDT)
Received: from minas-tirith.Central.Sun.COM (minas-tirith.Central.Sun.COM
	[129.153.128.154])
	by centralmail1brm.Central.Sun.COM (8.12.10+Sun/8.12.10/ENSMAIL,v2.2)
	with ESMTP id j4IGV2EN027678; Wed, 18 May 2005 10:31:02 -0600 (MDT)
Received: from minas-tirith.Central.Sun.COM (localhost [127.0.0.1])
	by minas-tirith.Central.Sun.COM (8.12.11+Sun/8.12.11) with ESMTP id
	j4IGWftg010777; Wed, 18 May 2005 11:32:41 -0500 (CDT)
Received: (from rmesta@localhost)
	by minas-tirith.Central.Sun.COM (8.12.11+Sun/8.12.11/Submit) id
	j4IGWdxZ010776; Wed, 18 May 2005 11:32:39 -0500 (CDT)
Date: Wed, 18 May 2005 11:32:39 -0500
From: Rick Mesta <Ricardo.Mesta@sun.com>
To: "Olaf M. Kolkman" <olaf@ripe.net>
Message-ID: <20050518163239.GA10343@minas-tirith.central.sun.com>
References: <20050509150200.28002bb0.olaf@ripe.net>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20050509150200.28002bb0.olaf@ripe.net>
User-Agent: Mutt/1.4.2.1i
X-Spam-Score: 0.0 (/)
X-Scan-Signature: 769a46790fb42fbb0b0cc700c82f7081
Cc: namedroppers@ops.ietf.org, okolkman@ripe.net, beepy@netapp.com,
	Spencer.Shepler@sun.com, Rick Mesta <Ricardo.Mesta@sun.com>,
	nfsv4@ietf.org, ogud@ogud.com
Subject: [nfsv4] Re: draft-mesta-nfs4id-dns-rr-01
X-BeenThere: nfsv4@ietf.org
X-Mailman-Version: 2.1.5
Precedence: list
Reply-To: Rick Mesta <Ricardo.Mesta@sun.com>
List-Id: NFSv4 Working Group <nfsv4.ietf.org>
List-Unsubscribe: <https://www1.ietf.org/mailman/listinfo/nfsv4>,
	<mailto:nfsv4-request@ietf.org?subject=unsubscribe>
List-Archive: <http://www1.ietf.org/pipermail/nfsv4>
List-Post: <mailto:nfsv4@ietf.org>
List-Help: <mailto:nfsv4-request@ietf.org?subject=help>
List-Subscribe: <https://www1.ietf.org/mailman/listinfo/nfsv4>,
	<mailto:nfsv4-request@ietf.org?subject=subscribe>
Sender: nfsv4-bounces@ietf.org
Errors-To: nfsv4-bounces@ietf.org


	Hi Olaf,

	 And thx for posting to the alias =:),

	<in-lined comments below...>

On Mon, May 09, 2005 at 03:02:00PM +0200, Olaf M. Kolkman wrote:
| 
| 
| Dear colleagues,
| 
| 
| Rick Mesta has recently posted the following I-D:
|  A DNS RR for NFSv4 ID Domains
|  draft-mesta-nfs4id-dns-rr-01
| 
| The document would benefit from some review by the DNS community.
| 
| I've had some private communication with the editor and one of the issues
| that came up during the conversation was that the NFS client is supposed
| to know its "default domain".
| 
| (The "default domain" being what e.g. the res_init() function would
| return).
| 
| AFAIK the "default domain" is not something that is part of the DNS
| specs but I am afraid I am overlooking something. What would be a
| proper specification or reference for the "default domain"?

	In case there is no bona fide DNS specification for a
	"default domain", I wonder if having "default domain" be
	loosely defined as an Operating Environment artifact
	(e.g. the configured /etc/resolv.conf domain, for UNIX-type
	systems) would be good enough for a specification.

	Thoughts ?
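As a rough illustration of the "configured /etc/resolv.conf domain" idea, the lookup might be sketched as below. The function name `default_domain` is hypothetical, and the parsing is deliberately simplified: the last `domain` or `search` line wins, only the first `search` entry is used, and comment stripping is cruder than what a real resolver library does.

```python
def default_domain(resolv_conf_text):
    """Pick a "default domain" out of resolv.conf-style text.

    Returns the lower-cased domain without a trailing dot, or None if the
    text has no usable "domain" or "search" directive.
    """
    domain = None
    for raw in resolv_conf_text.splitlines():
        # Drop '#' and ';' comments (simplification: also strips inline ones).
        line = raw.split("#", 1)[0].split(";", 1)[0].strip()
        fields = line.split()
        if len(fields) >= 2 and fields[0] in ("domain", "search"):
            # Later directives override earlier ones in this sketch.
            domain = fields[1].rstrip(".").lower()
    return domain
```

A caller would read the file itself, e.g. `default_domain(open("/etc/resolv.conf").read())`, and fall back to some other configured value when the result is None.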

		rick
| 
| Please try to CC the list on issues that are within scope and try to
| keep out-of-scope issues off list.
| 
| Thanks a lot.
| 
| -- Olaf
| 
-- 



From nfsv4-bounces@ietf.org Wed May 18 20:33:55 2005
Received: from localhost.localdomain ([127.0.0.1] helo=megatron.ietf.org)
	by megatron.ietf.org with esmtp (Exim 4.32)
	id 1DYYz5-0005mL-Pc; Wed, 18 May 2005 20:33:55 -0400
Received: from odin.ietf.org ([132.151.1.176] helo=ietf.org)
	by megatron.ietf.org with esmtp (Exim 4.32) id 1DYYz4-0005mB-AH
	for nfsv4@megatron.ietf.org; Wed, 18 May 2005 20:33:54 -0400
Received: from ietf-mx.ietf.org (ietf-mx.ietf.org [132.151.6.1])
	by ietf.org (8.9.1a/8.9.1a) with ESMTP id UAA10671
	for <nfsv4@ietf.org>; Wed, 18 May 2005 20:33:50 -0400 (EDT)
Received: from brmea-mail-3.sun.com ([192.18.98.34])
	by ietf-mx.ietf.org with esmtp (Exim 4.33) id 1DYZG4-0007Fy-DV
	for nfsv4@ietf.org; Wed, 18 May 2005 20:51:30 -0400
Received: from sfbaymail1sca.SFBay.Sun.COM ([129.145.154.35])
	by brmea-mail-3.sun.com (8.12.10/8.12.9) with ESMTP id j4J0XljO028367; 
	Wed, 18 May 2005 18:33:48 -0600 (MDT)
Received: from sheplap.Central.Sun.COM (sheplap.Central.Sun.COM [10.1.194.251])
	by sfbaymail1sca.SFBay.Sun.COM (8.12.10+Sun/8.12.10/ENSMAIL,v2.2) with
	ESMTP id j4J0Xljg022544; Wed, 18 May 2005 17:33:47 -0700 (PDT)
Received: by sheplap.Central.Sun.COM (Postfix, from userid 76367)
	id 86AC23C0CC3; Wed, 18 May 2005 19:34:03 -0500 (CDT)
Date: Wed, 18 May 2005 19:34:03 -0500
From: Spencer Shepler <spencer.shepler@sun.com>
To: rick@snowhite.cis.uoguelph.ca
Subject: Re: [nfsv4] re: NFS4ERR_OLD_STATEID and seqid
Message-ID: <20050519003403.GW12308@sheplap.Central.Sun.COM>
References: <200505171903.PAA02874@snowhite.cis.uoguelph.ca>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <200505171903.PAA02874@snowhite.cis.uoguelph.ca>
User-Agent: Mutt/1.4.1i
X-Spam-Score: 0.0 (/)
X-Scan-Signature: 39bd8f8cbb76cae18b7e23f7cf6b2b9f
Cc: nfsv4@ietf.org
X-BeenThere: nfsv4@ietf.org
X-Mailman-Version: 2.1.5
Precedence: list
Reply-To: spencer.shepler@sun.com
List-Id: NFSv4 Working Group <nfsv4.ietf.org>
List-Unsubscribe: <https://www1.ietf.org/mailman/listinfo/nfsv4>,
	<mailto:nfsv4-request@ietf.org?subject=unsubscribe>
List-Archive: <http://www1.ietf.org/pipermail/nfsv4>
List-Post: <mailto:nfsv4@ietf.org>
List-Help: <mailto:nfsv4-request@ietf.org?subject=help>
List-Subscribe: <https://www1.ietf.org/mailman/listinfo/nfsv4>,
	<mailto:nfsv4-request@ietf.org?subject=subscribe>
Sender: nfsv4-bounces@ietf.org
Errors-To: nfsv4-bounces@ietf.org

On Tue, rick@snowhite.cis.uoguelph.ca wrote:
> > In section 8.1.5, we have the following:
> > 
> >    The client MUST monotonically increment the sequence number for the
> >    CLOSE, LOCK, LOCKU, OPEN, OPEN_CONFIRM, and OPEN_DOWNGRADE
> >    operations.  This is true even in the event that the previous
> >    operation that used the sequence number received an error.  The only
> >    exception to this rule is if the previous operation received one of
> >    the following errors: NFS4ERR_STALE_CLIENTID, NFS4ERR_STALE_STATEID,
> >    NFS4ERR_BAD_STATEID, NFS4ERR_BAD_SEQID, NFS4ERR_BADXDR,
> >    NFS4ERR_RESOURCE, NFS4ERR_NOFILEHANDLE.
> > 
> > It appears that NFS4ERR_OLD_STATEID is missing from the list of errors
> > that do not increment the sequence id.
> > 
> > Any objections to that interpretation?
> 
> My server currently does increment the seqid before returning
> NFS4ERR_OLD_STATEID. I can easily change that (it actually allows me
> to delete some code for that special case), but I'd like to hear that
> everyone is agreed first and that NFS4ERR_OLD_STATEID is being added to the
> list as an Errata change?

I have had one offlist confirmation of my proposal and my intent is that
this will be listed in the errata (unless there is an opposing opinion
not yet expressed).

> 
> rick
> ps: There are several other quirky cases I ran into, along with the above.
>     You might find ftp://ftp.cis.uoguelph.ca/pub/nfsv4/Quirky.cases of
>     interest.
> 



From nfsv4-bounces@ietf.org Thu May 19 16:24:04 2005
Received: from localhost.localdomain ([127.0.0.1] helo=megatron.ietf.org)
	by megatron.ietf.org with esmtp (Exim 4.32)
	id 1DYrYq-0000hl-PO; Thu, 19 May 2005 16:24:04 -0400
Received: from odin.ietf.org ([132.151.1.176] helo=ietf.org)
	by megatron.ietf.org with esmtp (Exim 4.32) id 1DYrYp-0000hd-Da
	for nfsv4@megatron.ietf.org; Thu, 19 May 2005 16:24:03 -0400
Received: from ietf-mx.ietf.org (ietf-mx.ietf.org [132.151.6.1])
	by ietf.org (8.9.1a/8.9.1a) with ESMTP id QAA27254
	for <nfsv4@ietf.org>; Thu, 19 May 2005 16:24:00 -0400 (EDT)
From: rick@snowhite.cis.uoguelph.ca
Received: from dargo.cs.uoguelph.ca ([131.104.96.159])
	by ietf-mx.ietf.org with esmtp (Exim 4.33) id 1DYrq0-0001Ci-48
	for nfsv4@ietf.org; Thu, 19 May 2005 16:41:51 -0400
Received: from snowhite.cis.uoguelph.ca (snowhite.cis.uoguelph.ca
	[131.104.48.1])
	by dargo.cs.uoguelph.ca (8.13.1/8.13.1) with ESMTP id j4JKNwdr013852
	for <nfsv4@ietf.org>; Thu, 19 May 2005 16:23:58 -0400
Received: (from rick@localhost)
	by snowhite.cis.uoguelph.ca (8.9.3/8.9.3) id QAA21238
	for nfsv4@ietf.org; Thu, 19 May 2005 16:25:29 -0400 (EDT)
Date: Thu, 19 May 2005 16:25:29 -0400 (EDT)
Message-Id: <200505192025.QAA21238@snowhite.cis.uoguelph.ca>
To: nfsv4@ietf.org
X-Scanned-By: MIMEDefang 2.44
X-Spam-Score: 0.3 (/)
X-Scan-Signature: ea4ac80f790299f943f0a53be7e1a21a
Subject: [nfsv4] client re-use of lock_owner name
X-BeenThere: nfsv4@ietf.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: NFSv4 Working Group <nfsv4.ietf.org>
List-Unsubscribe: <https://www1.ietf.org/mailman/listinfo/nfsv4>,
	<mailto:nfsv4-request@ietf.org?subject=unsubscribe>
List-Archive: <http://www1.ietf.org/pipermail/nfsv4>
List-Post: <mailto:nfsv4@ietf.org>
List-Help: <mailto:nfsv4-request@ietf.org?subject=help>
List-Subscribe: <https://www1.ietf.org/mailman/listinfo/nfsv4>,
	<mailto:nfsv4-request@ietf.org?subject=subscribe>
Sender: nfsv4-bounces@ietf.org
Errors-To: nfsv4-bounces@ietf.org

I just ran into this case when testing a very green client I've started
to work on:
"P1" on client:
- Opened "xx" with lock_owner4 name "P1" with seqid# 0
  - got a sequence of seqid#s starting with OpenConfirm
  - did assorted Ops on "xx"
  - Closed OpenStateid for the above Open
a little later
- Opened "yy" with lock_owner4 name "P1", but with a fresh seqid# of 0

At this point my server replied NFS4ERR_BAD_SEQID.

Now, looking at the RFC (Sec. 8.1.7 and 8.1.8), it indicates the server
can choose to release the lock_owner when there is no outstanding state
associated with it, but doesn't seem to clarify if/when the client can
do something similar (aka forget the seqid#).

I've already changed my client to hold onto the lock_owner seqid# until
the associated process terminates (at the least it will reduce the number
of OpenConfirms required), but I am wondering what the correct server
response is for the above case?
Door #1 - NFS4ERR_BAD_SEQID as I currently do
OR
Door #2 - re-initialize the seqid# and require an OpenConfirm, since no state
    is currently associated with the open lock_owner

I am now leaning towards Door #2, since it seems harmless and would accommodate
clients that might choose to re-use lock_owner names.

So, should I take Door #1 or Door #2? rick



From nfsv4-bounces@ietf.org Thu May 19 16:47:52 2005
Received: from localhost.localdomain ([127.0.0.1] helo=megatron.ietf.org)
	by megatron.ietf.org with esmtp (Exim 4.32)
	id 1DYrvs-000079-9T; Thu, 19 May 2005 16:47:52 -0400
Received: from odin.ietf.org ([132.151.1.176] helo=ietf.org)
	by megatron.ietf.org with esmtp (Exim 4.32) id 1DYrvo-00006l-7k
	for nfsv4@megatron.ietf.org; Thu, 19 May 2005 16:47:49 -0400
Received: from ietf-mx.ietf.org (ietf-mx.ietf.org [132.151.6.1])
	by ietf.org (8.9.1a/8.9.1a) with ESMTP id QAA29756
	for <nfsv4@ietf.org>; Thu, 19 May 2005 16:47:45 -0400 (EDT)
Received: from pat.uio.no ([129.240.130.16] ident=7411)
	by ietf-mx.ietf.org with esmtp (Exim 4.33) id 1DYsD1-0001uv-8L
	for nfsv4@ietf.org; Thu, 19 May 2005 17:05:36 -0400
Received: from mail-mx1.uio.no ([129.240.10.29])
	by pat.uio.no with esmtp (Exim 4.43)
	id 1DYrvi-000798-Cm; Thu, 19 May 2005 22:47:42 +0200
Received: from dh138.citi.umich.edu ([141.211.133.138])
	by mail-mx1.uio.no with esmtpsa (SSLv3:RC4-MD5:128) (Exim 4.43)
	id 1DYrvb-0003FG-DO; Thu, 19 May 2005 22:47:35 +0200
Subject: Re: [nfsv4] client re-use of lock_owner name
From: Trond Myklebust <trond.myklebust@fys.uio.no>
To: rick@snowhite.cis.uoguelph.ca
In-Reply-To: <200505192025.QAA21238@snowhite.cis.uoguelph.ca>
References: <200505192025.QAA21238@snowhite.cis.uoguelph.ca>
Content-Type: text/plain
Date: Thu, 19 May 2005 16:47:32 -0400
Message-Id: <1116535652.20456.12.camel@lade.trondhjem.org>
Mime-Version: 1.0
X-Mailer: Evolution 2.2.1.1 
Content-Transfer-Encoding: 7bit
X-UiO-Spam-info: not spam, SpamAssassin (score=-3.86, required 12,
	autolearn=disabled, AWL 1.14, UIO_MAIL_IS_INTERNAL -5.00)
X-Spam-Score: 0.0 (/)
X-Scan-Signature: 4adaf050708fb13be3316a9eee889caa
Content-Transfer-Encoding: 7bit
Cc: nfsv4@ietf.org
X-BeenThere: nfsv4@ietf.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: NFSv4 Working Group <nfsv4.ietf.org>
List-Unsubscribe: <https://www1.ietf.org/mailman/listinfo/nfsv4>,
	<mailto:nfsv4-request@ietf.org?subject=unsubscribe>
List-Archive: <http://www1.ietf.org/pipermail/nfsv4>
List-Post: <mailto:nfsv4@ietf.org>
List-Help: <mailto:nfsv4-request@ietf.org?subject=help>
List-Subscribe: <https://www1.ietf.org/mailman/listinfo/nfsv4>,
	<mailto:nfsv4-request@ietf.org?subject=subscribe>
Sender: nfsv4-bounces@ietf.org
Errors-To: nfsv4-bounces@ietf.org

On Thu, 19.05.2005 at 16:25 (-0400),
rick@snowhite.cis.uoguelph.ca wrote:
> I just ran into this case when testing a very green client I've started
> to work on:
> "P1" on client:
> - Opened "xx" with lock_owner4 name "P1" with seqid# 0
>   - got a sequence of seqid#s starting with OpenConfirm
>   - did assorted Ops on "xx"
>   - Closed OpenStateid for the above Open
> a little later
> - Opened "yy" with lock_owner4 name "P1", but with a fresh seqid# of 0
> 
> At this point my server replied NFS4ERR_BAD_SEQID.
> 
> Now, looking at the RFC (Sec. 8.1.7 and 8.1.8), it indicates the server
> can choose to release the lock_owner when there is no outstanding state
> associated with it, but doesn't seem to clarify if/when the client can
> do something similar (aka forget the seqid#).
> 
> I've already changed my client to hold onto the lock_owner seqid# until
> the associated process terminates (at the least it will reduce the number
> of OpenConfirms required), but I am wondering what the correct server
> response is for the above case?
> Door #1 - NFS4ERR_BAD_SEQID as I currently do
> OR
> Door #2 - re-initialize the seqid# and require an OpenConfirm, since no state
>     is currently associated with the open lock_owner
> 
> I am now leaning towards Door #2, since it seems harmless and would accommodate
> clients that might choose to re-use lock_owner names.
> 
> So, should I take Door #1 or Door #2? rick

You might be able to do Door #2 when NFSv4.1 sessions are implemented,
but in minor version 0, the seqid is there in order to allow the server
to detect replayed requests so that it can offer correct only-once
semantics.
If the client were to be allowed to forget the seqid as you suggest,
then some new mechanism would be needed in order to tell the server that
this is not a replay of the initial OPEN request.

So Door #1 looks correct for minor version 0.
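The replay detection described above can be sketched roughly as follows. This is a simplified model, not any particular server's code: `OwnerState` and `check_seqid` are hypothetical names, replies are plain strings, and a real server would also compare the request itself before treating it as a replay.

```python
SEQID_MODULUS = 2 ** 32  # assumption: seqid wraps like an unsigned 32-bit counter


class OwnerState:
    """Per-owner record the server keeps: last seqid seen and its reply."""

    def __init__(self, first_seqid, first_reply):
        self.last_seqid = first_seqid
        self.last_reply = first_reply


def check_seqid(owner, seqid, execute):
    """Apply the seqid check for one seqid-bearing request.

    `execute` is a zero-argument callable that performs the operation and
    returns its reply; it runs only when the request is new.
    """
    if seqid == owner.last_seqid:
        # Same seqid as last time: treat as a retransmission, do not re-run.
        return owner.last_reply
    if seqid == (owner.last_seqid + 1) % SEQID_MODULUS:
        # Next in sequence: a genuinely new request.
        owner.last_reply = execute()
        owner.last_seqid = seqid
        return owner.last_reply
    # Anything else, e.g. a client that forgot its seqid and restarted at 0.
    return "NFS4ERR_BAD_SEQID"
```

In the scenario from the original message, the server still remembers the owner with a later seqid, so the second OPEN with seqid 0 falls into the last branch. If the server accepted it instead, it could no longer tell that OPEN apart from a delayed retransmission of the very first one.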

Cheers,
  Trond




From nfsv4-bounces@ietf.org Thu May 19 16:58:37 2005
Received: from localhost.localdomain ([127.0.0.1] helo=megatron.ietf.org)
	by megatron.ietf.org with esmtp (Exim 4.32)
	id 1DYs6H-0003JM-Jo; Thu, 19 May 2005 16:58:37 -0400
Received: from odin.ietf.org ([132.151.1.176] helo=ietf.org)
	by megatron.ietf.org with esmtp (Exim 4.32) id 1DYs6E-0003JD-Ii
	for nfsv4@megatron.ietf.org; Thu, 19 May 2005 16:58:35 -0400
Received: from ietf-mx.ietf.org (ietf-mx.ietf.org [132.151.6.1])
	by ietf.org (8.9.1a/8.9.1a) with ESMTP id QAA00564
	for <nfsv4@ietf.org>; Thu, 19 May 2005 16:58:32 -0400 (EDT)
Received: from mx2.netapp.com ([216.240.18.37])
	by ietf-mx.ietf.org with esmtp (Exim 4.33) id 1DYsNS-0002BF-MN
	for nfsv4@ietf.org; Thu, 19 May 2005 17:16:23 -0400
Received: from smtp2.corp.netapp.com (10.57.159.114)
	by mx2.netapp.com with ESMTP; 19 May 2005 13:58:25 -0700
X-IronPort-AV: i="3.93,121,1115017200"; 
	d="scan'208"; a="216427352:sNHT22150120"
Received: from svlexc02.hq.netapp.com (svlexc02.corp.netapp.com
	[10.57.157.136])
	by smtp2.corp.netapp.com (8.13.1/8.13.1/NTAP-1.6) with ESMTP id
	j4JKwOVq006398; Thu, 19 May 2005 13:58:24 -0700 (PDT)
Received: from burgundy.hq.netapp.com ([10.56.10.66]) by
	svlexc02.hq.netapp.com with Microsoft SMTPSVC(5.0.2195.6713); 
	Thu, 19 May 2005 13:58:24 -0700
Received: from exnane01.hq.netapp.com ([10.97.0.61]) by burgundy.hq.netapp.com
	with Microsoft SMTPSVC(5.0.2195.6713); 
	Thu, 19 May 2005 13:58:23 -0700
X-MimeOLE: Produced By Microsoft Exchange V6.5.7226.0
Content-class: urn:content-classes:message
MIME-Version: 1.0
Content-Type: text/plain;
	charset="us-ascii"
Content-Transfer-Encoding: quoted-printable
Subject: RE: [nfsv4] client re-use of lock_owner name
Date: Thu, 19 May 2005 16:58:22 -0400
Message-ID: <C98692FD98048C41885E0B0FACD9DFB8BBBE4F@exnane01.hq.netapp.com>
Thread-Topic: [nfsv4] client re-use of lock_owner name
Thread-Index: AcVcsR/9XLJiYyB2QkyjYqU1gSPgfAAAtPNQ
From: "Noveck, Dave" <Dave.Noveck@netapp.com>
To: <rick@snowhite.cis.uoguelph.ca>, <nfsv4@ietf.org>
X-OriginalArrivalTime: 19 May 2005 20:58:23.0867 (UTC)
	FILETIME=[7EF3CCB0:01C55CB5]
X-Spam-Score: 0.0 (/)
X-Scan-Signature: 944ecb6e61f753561f559a497458fb4f
Content-Transfer-Encoding: quoted-printable
Cc: 
X-BeenThere: nfsv4@ietf.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: NFSv4 Working Group <nfsv4.ietf.org>
List-Unsubscribe: <https://www1.ietf.org/mailman/listinfo/nfsv4>,
	<mailto:nfsv4-request@ietf.org?subject=unsubscribe>
List-Archive: <http://www1.ietf.org/pipermail/nfsv4>
List-Post: <mailto:nfsv4@ietf.org>
List-Help: <mailto:nfsv4-request@ietf.org?subject=help>
List-Subscribe: <https://www1.ietf.org/mailman/listinfo/nfsv4>,
	<mailto:nfsv4-request@ietf.org?subject=subscribe>
Sender: nfsv4-bounces@ietf.org
Errors-To: nfsv4-bounces@ietf.org

> I just ran into this case when testing a very green client I've started
> to work on:
> "P1" on client:
> - Opened "xx" with lock_owner4 name "P1" with seqid# 0
>   - got a sequence of seqid#s starting with OpenConfirm
>   - did assorted Ops on "xx"
>   - Closed OpenStateid for the above Open
> a little later
> - Opened "yy" with lock_owner4 name "P1", but with a fresh seqid# of 0
>
> At this point my server replied NFS4ERR_BAD_SEQID.
>
> Now, looking at the RFC (Sec. 8.1.7 and 8.1.8), it indicates the server
> can choose to release the lock_owner when there is no outstanding state
> associated with it, but doesn't seem to clarify if/when the client can
> do something similar (aka forget the seqid#).

I would say that it gives no indication that such a thing is possible.

You have a mechanism where both sides maintain a sequence number that has
to match, and if one side just decides to forget the sequence number,
then that can't work.  The stuff about the server dropping it is an
exception that the spec explicitly makes, since otherwise the server
might have to keep an unbounded amount of state.

> I've already changed my client to hold onto the lock_owner seqid# until
> the associated process terminates (at the least it will reduce the number
> of OpenConfirms required), but I am wondering what the correct server
> response is for the above case?
> Door #1 - NFS4ERR_BADSEQID as I currently do
> OR
> Door #2 - re-initialize the seqid# and require an OpenConfirm, since no state
>     is currently associated with the open lock_owner
>
> I am now leaning towards Door #2, since it seems harmless and would accommodate
> clients that might choose to re-use lock_owner names.

It may be harmless in terms of direct effect (I wouldn't swear to even
that), but it would be a change in the protocol.  It would allow clients
to forget the current sequence number, and some servers wouldn't work
with that, so when such clients tried to inter-operate with such servers,
they would get an unpleasant surprise.

> So, should I take Door #1 or Door #2?

I think the right answer here is door #1 on the server, with the client
making sure it does not reuse owner strings if it is forgetting the
sequence information.  It is pretty easy to unique-ify the strings to
prevent a problem when creating new owners.
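
As a rough illustration of the unique-ifying idea, here is a minimal
sketch. The helper name, the string layout, and the per-client counter
are all assumptions made up for this example; the spec only treats the
owner as an opaque string and does not prescribe any of this.

```python
import itertools

# Hypothetical generation counter, kept for the lifetime of one client instance.
_generation = itertools.count(1)


def new_open_owner(client_tag: str, pid: int) -> bytes:
    """Build an owner string that this client instance never hands out twice.

    client_tag and pid are illustrative inputs; the trailing counter is the
    uniquifier, so a process that forgot its seqid gets a brand-new owner
    instead of reusing an old name with a stale sequence.
    """
    return f"{client_tag}:{pid}:{next(_generation)}".encode("ascii")
```

Two calls with the same arguments return different strings, so the
server never sees a "fresh" seqid of 0 on an owner it still remembers.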

-----Original Message-----
From: rick@snowhite.cis.uoguelph.ca
[mailto:rick@snowhite.cis.uoguelph.ca]
Sent: Thursday, May 19, 2005 4:25 PM
To: nfsv4@ietf.org
Subject: [nfsv4] client re-use of lock_owner name


I just ran into this case when testing a very green client I've started
to work on:
"P1" on client:
- Opened "xx" with lock_owner4 name "P1" with seqid# 0
  - got a sequence of seqid#s started with OpenConfirm
  - did assorted Ops on "xx"
  - Closed OpenStateid for the above Open
a little later
- Opened "yy" with lock_owner4 name "P1", but with a fresh seqid# of 0

At this point my server replied NFS4ERR_BADSEQID.

Now, looking at the RFC (Sec. 8.1.7 and 8.1.8), it indicates the server
can choose to release the lock_owner when there is no outstanding state
associated with it, but doesn't seem to clarify if/when the client can
do something similar (aka forget the seqid#).

I've already changed my client to hold onto the lock_owner seqid# until
the associated process terminates (at the least it will reduce the number
of OpenConfirms required), but I am wondering what the correct server
response is for the above case?
Door #1 - NFS4ERR_BADSEQID as I currently do
OR
Door #2 - re-initialize the seqid# and require an OpenConfirm, since no state
    is currently associated with the open lock_owner

I am now leaning towards Door #2, since it seems harmless and would accommodate
clients that might choose to re-use lock_owner names.

So, should I take Door #1 or Door #2? rick

_______________________________________________
nfsv4 mailing list
nfsv4@ietf.org
https://www1.ietf.org/mailman/listinfo/nfsv4


From nfsv4-bounces@ietf.org Thu May 19 20:00:10 2005
Received: from localhost.localdomain ([127.0.0.1] helo=megatron.ietf.org)
	by megatron.ietf.org with esmtp (Exim 4.32)
	id 1DYuvy-0001Rz-69; Thu, 19 May 2005 20:00:10 -0400
Received: from odin.ietf.org ([132.151.1.176] helo=ietf.org)
	by megatron.ietf.org with esmtp (Exim 4.32) id 1DYuvv-0001Qa-WC
	for nfsv4@megatron.ietf.org; Thu, 19 May 2005 20:00:08 -0400
Received: from ietf-mx.ietf.org (ietf-mx.ietf.org [132.151.6.1])
	by ietf.org (8.9.1a/8.9.1a) with ESMTP id TAA18175
	for <nfsv4@ietf.org>; Thu, 19 May 2005 19:59:56 -0400 (EDT)
From: rick@snowhite.cis.uoguelph.ca
Received: from mailhub.cs.uoguelph.ca ([131.104.96.75])
	by ietf-mx.ietf.org with esmtp (Exim 4.33) id 1DYvD0-0007I5-QU
	for nfsv4@ietf.org; Thu, 19 May 2005 20:17:48 -0400
Received: from snowhite.cis.uoguelph.ca (snowhite.cis.uoguelph.ca
	[131.104.48.1])
	by mailhub.cs.uoguelph.ca (8.13.1/8.13.1) with ESMTP id j4JNxsKV026627
	for <nfsv4@ietf.org>; Thu, 19 May 2005 19:59:54 -0400
Received: (from rick@localhost)
	by snowhite.cis.uoguelph.ca (8.9.3/8.9.3) id UAA22564
	for nfsv4@ietf.org; Thu, 19 May 2005 20:01:26 -0400 (EDT)
Date: Thu, 19 May 2005 20:01:26 -0400 (EDT)
Message-Id: <200505200001.UAA22564@snowhite.cis.uoguelph.ca>
To: nfsv4@ietf.org
X-Scanned-By: MIMEDefang 2.44
X-Spam-Score: 0.3 (/)
X-Scan-Signature: 7baded97d9887f7a0c7e8a33c2e3ea1b
Subject: [nfsv4] more re: client re-using lock_owner
X-BeenThere: nfsv4@ietf.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: NFSv4 Working Group <nfsv4.ietf.org>
List-Unsubscribe: <https://www1.ietf.org/mailman/listinfo/nfsv4>,
	<mailto:nfsv4-request@ietf.org?subject=unsubscribe>
List-Archive: <http://www1.ietf.org/pipermail/nfsv4>
List-Post: <mailto:nfsv4@ietf.org>
List-Help: <mailto:nfsv4-request@ietf.org?subject=help>
List-Subscribe: <https://www1.ietf.org/mailman/listinfo/nfsv4>,
	<mailto:nfsv4-request@ietf.org?subject=subscribe>
Sender: nfsv4-bounces@ietf.org
Errors-To: nfsv4-bounces@ietf.org

Well, I'm happy to continue using Door #1, since that's what my code already
does. However, I'm a little nervous that I've missed some fundamental
concept, since I can't see what the problem with Door #2 would be.

Here's an example scenario in more detail:
The client does...
1	Opens "xx" with lock_owner "P1" and seqid#0, gets Stateid0
2	Does an OpenConfirm (as required by the server) with seqid#1, so the
		seqid# is properly initialized, using Stateid0
3	Opens "yy" with lock_owner "P1" and seqid#2, gets Stateid1
4	Opens "zz" with lock_owner "P1" and seqid#3, gets Stateid2
5	Closes "xx" with seqid#4 and Stateid0
6	Closes "yy" with seqid#5 and Stateid1
7	Closes "zz" with seqid#6 and Stateid2

Now, I can see why #2 is needed and I can see why the seqid# has to
be correct for #2-#7.

But, it now seems that lock_owner "P1" is back to what I might call State0.
(No open/lock state associated with lock_owner "P1", which has an
 OpenConfirm'd seqid#.)
Continuing on...
8a	Server throws away lock_owner "P1" (allowed, but not required by
		Sec. 8.1.7 as I understand it)
8	Client opens "aa" with lock_owner "P1" and seqid#0

Now, if 8a has been done, the server would allow 8, requiring an
OpenConfirm for the new seqid# sequence starting at 0.

However, if 8a hasn't been done by the server, 8 will not be allowed and
the server will reply NFS4ERR_BADSEQID, which is what I meant by Door #1.

Whereas what I meant by Door #2 was to allow 8, re-initializing the
seqid# sequence at 0 and requiring another OpenConfirm.
(Note that this case only occurs after at least the sequence
of Open, OpenConfirm and Close have occurred. I am not talking about the
case of just an unconfirmed Open.) Since only one of the Ops in sequence
could be replayed, I just can't see what the problem with doing this is?
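
To make the two doors concrete, here is a toy model of the server side
of the scenario above. It is only a sketch: the class and method names
are invented for illustration, replay detection and seqid wraparound
are left out, and a real server tracks far more state than this.

```python
from dataclasses import dataclass

OK = "NFS4_OK"
OK_CONFIRM = "NFS4_OK + OPEN4_RESULT_CONFIRM"
BADSEQID = "NFS4ERR_BADSEQID"


@dataclass
class Owner:
    next_seqid: int      # seqid the server expects on the next request
    open_count: int = 0  # opens currently held by this owner


class ToyServer:
    """Hypothetical model; door=1 or door=2 selects the policy discussed above."""

    def __init__(self, door):
        self.door = door
        self.owners = {}

    def _start_sequence(self, name, seqid):
        # Begin a new seqid sequence and ask the client to confirm it.
        self.owners[name] = Owner(next_seqid=seqid + 1, open_count=1)
        return OK_CONFIRM

    def open(self, name, seqid):
        owner = self.owners.get(name)
        if owner is None:
            return self._start_sequence(name, seqid)
        if seqid == owner.next_seqid:
            owner.next_seqid += 1
            owner.open_count += 1
            return OK
        if self.door == 2 and owner.open_count == 0:
            # Door #2: no opens held, so behave as if the owner had been dropped.
            return self._start_sequence(name, seqid)
        return BADSEQID  # Door #1, or Door #2 while opens are still held

    def confirm(self, name, seqid):
        return self._advance(name, seqid, closes=0)

    def close(self, name, seqid):
        return self._advance(name, seqid, closes=1)

    def _advance(self, name, seqid, closes):
        owner = self.owners.get(name)
        if owner is None or seqid != owner.next_seqid:
            return BADSEQID
        owner.next_seqid += 1
        owner.open_count -= closes
        return OK

    def forget(self, name):
        """Step 8a: drop an owner that holds no opens."""
        if self.owners[name].open_count == 0:
            del self.owners[name]


def run_steps_1_to_7(server):
    server.open("P1", 0)       # step 1
    server.confirm("P1", 1)    # step 2
    server.open("P1", 2)       # step 3
    server.open("P1", 3)       # step 4
    for seqid in (4, 5, 6):    # steps 5-7
        server.close("P1", seqid)
```

After run_steps_1_to_7, step 8 is open("P1", 0). With door=1 the model
replies NFS4ERR_BADSEQID; with door=2 it replies with a confirm-required
success; and with door=1 it gives that same confirm-required success if
forget("P1") (step 8a) ran first.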

Am I making sense? rick

_______________________________________________
nfsv4 mailing list
nfsv4@ietf.org
https://www1.ietf.org/mailman/listinfo/nfsv4


From nfsv4-bounces@ietf.org Fri May 20 08:49:45 2005
Received: from localhost.localdomain ([127.0.0.1] helo=megatron.ietf.org)
	by megatron.ietf.org with esmtp (Exim 4.32)
	id 1DZ6wj-0001Yb-1S; Fri, 20 May 2005 08:49:45 -0400
Received: from odin.ietf.org ([132.151.1.176] helo=ietf.org)
	by megatron.ietf.org with esmtp (Exim 4.32) id 1DZ6we-0001Y6-Pu
	for nfsv4@megatron.ietf.org; Fri, 20 May 2005 08:49:42 -0400
Received: from ietf-mx.ietf.org (ietf-mx.ietf.org [132.151.6.1])
	by ietf.org (8.9.1a/8.9.1a) with ESMTP id IAA09885
	for <nfsv4@ietf.org>; Fri, 20 May 2005 08:49:38 -0400 (EDT)
Received: from mx2.netapp.com ([216.240.18.37])
	by ietf-mx.ietf.org with esmtp (Exim 4.33) id 1DZ7Dy-0000fp-Ir
	for nfsv4@ietf.org; Fri, 20 May 2005 09:07:36 -0400
Received: from smtp1.corp.netapp.com (10.57.156.124)
	by mx2.netapp.com with ESMTP; 20 May 2005 05:49:29 -0700
X-IronPort-AV: i="3.93,123,1115017200"; 
	d="scan'208"; a="216700246:sNHT21057864"
Received: from svlexc02.hq.netapp.com (svlexc02.corp.netapp.com
	[10.57.157.136])
	by smtp1.corp.netapp.com (8.13.1/8.13.1/NTAP-1.6) with ESMTP id
	j4KCnReN018235; Fri, 20 May 2005 05:49:28 -0700 (PDT)
Received: from lavender.hq.netapp.com ([10.56.11.75]) by
	svlexc02.hq.netapp.com with Microsoft SMTPSVC(5.0.2195.6713); 
	Fri, 20 May 2005 05:49:27 -0700
Received: from exnane01.hq.netapp.com ([10.97.0.61]) by lavender.hq.netapp.com
	with Microsoft SMTPSVC(5.0.2195.6713); 
	Fri, 20 May 2005 05:49:27 -0700
X-MimeOLE: Produced By Microsoft Exchange V6.5.7226.0
Content-class: urn:content-classes:message
MIME-Version: 1.0
Content-Type: text/plain;
	charset="iso-8859-1"
Content-Transfer-Encoding: quoted-printable
Subject: RE: [nfsv4] more re: client re-using lock_owner
Date: Fri, 20 May 2005 08:49:26 -0400
Message-ID: <C98692FD98048C41885E0B0FACD9DFB8BBBE55@exnane01.hq.netapp.com>
Thread-Topic: [nfsv4] more re: client re-using lock_owner
Thread-Index: AcVcz0Q4K8QtJGeVRDiBPQG9O2Pa1wAaEl1g
From: "Noveck, Dave" <Dave.Noveck@netapp.com>
To: <rick@snowhite.cis.uoguelph.ca>, <nfsv4@ietf.org>
X-OriginalArrivalTime: 20 May 2005 12:49:27.0670 (UTC)
	FILETIME=[5BA12160:01C55D3A]
X-Spam-Score: 0.0 (/)
X-Scan-Signature: 41c17b4b16d1eedaa8395c26e9a251c4
Content-Transfer-Encoding: quoted-printable
Cc: 
X-BeenThere: nfsv4@ietf.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: NFSv4 Working Group <nfsv4.ietf.org>
List-Unsubscribe: <https://www1.ietf.org/mailman/listinfo/nfsv4>,
	<mailto:nfsv4-request@ietf.org?subject=unsubscribe>
List-Archive: <http://www1.ietf.org/pipermail/nfsv4>
List-Post: <mailto:nfsv4@ietf.org>
List-Help: <mailto:nfsv4-request@ietf.org?subject=help>
List-Subscribe: <https://www1.ietf.org/mailman/listinfo/nfsv4>,
	<mailto:nfsv4-request@ietf.org?subject=subscribe>
Sender: nfsv4-bounces@ietf.org
Errors-To: nfsv4-bounces@ietf.org

OK, I see.  Since the spec allows the server to forget owner state
when there are no associated opens, it follows that returning OK
and requiring open-confirm cannot be illegal, since a server which
has dropped that open state will do precisely that.

However, even though it is not illegal, it still strikes me as
a very bad thing to do.  You have enough information to figure
out that the seqid is wrong, so it just seems better to tell
the client that, even though you might not be able to if you had
dropped that info.  With door #2 you have a server in which
getting a bad owner seqid when you have no opens brings on an
attack of amnesia, so you always respond as if you had dropped
the owner info.

_______________________________________________
nfsv4 mailing list
nfsv4@ietf.org
https://www1.ietf.org/mailman/listinfo/nfsv4


From nfsv4-bounces@ietf.org Fri May 20 13:51:49 2005
Received: from localhost.localdomain ([127.0.0.1] helo=megatron.ietf.org)
	by megatron.ietf.org with esmtp (Exim 4.32)
	id 1DZBf3-0008Ta-4m; Fri, 20 May 2005 13:51:49 -0400
Received: from odin.ietf.org ([132.151.1.176] helo=ietf.org)
	by megatron.ietf.org with esmtp (Exim 4.32) id 1DZBez-0008TQ-EC
	for nfsv4@megatron.ietf.org; Fri, 20 May 2005 13:51:47 -0400
Received: from ietf-mx.ietf.org (ietf-mx.ietf.org [132.151.6.1])
	by ietf.org (8.9.1a/8.9.1a) with ESMTP id NAA15861
	for <nfsv4@ietf.org>; Fri, 20 May 2005 13:51:44 -0400 (EDT)
Received: from brmea-mail-4.sun.com ([192.18.98.36])
	by ietf-mx.ietf.org with esmtp (Exim 4.33) id 1DZBwI-00017q-8i
	for nfsv4@ietf.org; Fri, 20 May 2005 14:09:45 -0400
Received: from phys-aus08-1 ([129.153.131.88])
	by brmea-mail-4.sun.com (8.12.10/8.12.9) with ESMTP id j4KHpU0R011397
	for <nfsv4@ietf.org>; Fri, 20 May 2005 11:51:30 -0600 (MDT)
Received: from conversion-daemon.aus08-mail1.central.sun.com by
	aus08-mail1.central.sun.com
	(iPlanet Messaging Server 5.2 HotFix 1.24 (built Dec 19 2003))
	id <0IGS00H01UBRRG@aus08-mail1.central.sun.com>
	(original mail from David.Robinson@Sun.COM) for nfsv4@ietf.org; Fri,
	20 May 2005 12:51:30 -0500 (CDT)
Received: from [129.153.128.60] (jetsun.Central.Sun.COM [129.153.128.60])
	by aus08-mail1.central.sun.com
	(iPlanet Messaging Server 5.2 HotFix 1.24 (built Dec 19 2003))
	with ESMTP id <0IGS007XAUXSQO@aus08-mail1.central.sun.com>; Fri,
	20 May 2005 12:51:30 -0500 (CDT)
Date: Fri, 20 May 2005 12:51:28 -0500
From: David Robinson <David.Robinson@Sun.COM>
Subject: Re: [nfsv4] more re: client re-using lock_owner
In-reply-to: <C98692FD98048C41885E0B0FACD9DFB8BBBE55@exnane01.hq.netapp.com>
To: "Noveck, Dave" <Dave.Noveck@netapp.com>
Message-id: <428E23A0.9090300@sun.com>
MIME-version: 1.0
Content-type: text/plain; charset=ISO-8859-1; format=flowed
Content-transfer-encoding: 7BIT
X-Accept-Language: en-us, en
User-Agent: Mozilla Thunderbird 1.0 (X11/20050129)
References: <C98692FD98048C41885E0B0FACD9DFB8BBBE55@exnane01.hq.netapp.com>
X-Spam-Score: 0.0 (/)
X-Scan-Signature: f607d15ccc2bc4eaf3ade8ffa8af02a0
Content-Transfer-Encoding: 7BIT
Cc: nfsv4@ietf.org, rick@snowhite.cis.uoguelph.ca
X-BeenThere: nfsv4@ietf.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: NFSv4 Working Group <nfsv4.ietf.org>
List-Unsubscribe: <https://www1.ietf.org/mailman/listinfo/nfsv4>,
	<mailto:nfsv4-request@ietf.org?subject=unsubscribe>
List-Archive: <http://www1.ietf.org/pipermail/nfsv4>
List-Post: <mailto:nfsv4@ietf.org>
List-Help: <mailto:nfsv4-request@ietf.org?subject=help>
List-Subscribe: <https://www1.ietf.org/mailman/listinfo/nfsv4>,
	<mailto:nfsv4-request@ietf.org?subject=subscribe>
Sender: nfsv4-bounces@ietf.org
Errors-To: nfsv4-bounces@ietf.org

Noveck, Dave wrote:
> OK, I see.  Since the spec allows the server to forget owner state
> when there are no associated opens, it follows that returning OK
> and requiring open-confirm cannot be illegal, since a server which
> has dropped that open state will do precisely that.
> 
> However, even though it is not illegal, it still strikes me as
> a very bad thing to do.  You have enough information to figure
> out that the seqid is wrong so that it just seems better to tell 
> the client that, even though you might not be able to if you had
> dropped that info.  With door #2 you have a server in which 
> getting a bad owner seqid when you have no opens brings on an 
> attack of amnesia, so you always respond as if you had dropped
> the owner info.   

This seems unduly harsh. Section 8.1.8 says that the server upon
seeing a lock_owner for the "first time" will require an
OPEN_CONFIRM. But the spec fails to talk very clearly
about what is the "first time". It is clear that while there
is active state for that lock_owner it can't be the first
time and the server MUST return back NFS4ERR_BADSEQID, the
client has screwed up state that it is required to maintain.

However, if there is no active state, how long a time period or
what events must occur before the reuse of a lock_owner
qualifies to be the "first time"? And how is a client expected
to be able to reliably determine that fact?  The other factor is
that if the client had done something major like rebooted and
lost any history, it may not have any way to construct a valid
seqid so it must start with a new lock_owner. Seemingly harmless
to do such a thing, but I can imagine scenarios where a client
would rather not keep trying new values until it hit one that
worked. Eventually the server will forget that lock_owner and
will think it is the "first time", so why not make that a
more pro-active event?

So I would like Door #2 Monty...

Reasonable behavior should be:

1) Client MUST use the correct next seqid if there is active state
    for that lock_owner.
2) Client SHOULD use the correct next seqid if there is no
    active state for that lock_owner.
3) The server MUST return NFS4ERR_BADSEQID if the OPEN lock_owner
    has an incorrect seqid and there is active state for that
    lock_owner.
4) The server SHOULD allow the OPEN if the lock_owner has the
    correct seqid and there is no active state for that lock_owner.
5) The server SHOULD (MUST?) return OPEN4_RESULT_CONFIRM if the
    lock_owner has an incorrect seqid and there is no active state
    for that lock_owner.

So in normal processing everyone SHOULD make a reasonable effort
to maintain enough information to minimize the need to use
OPEN_CONFIRM. But in the absence of active state, if one side
forgets, it is useful to allow the lock_owner to be reused
by sending OPEN4_RESULT_CONFIRM instead of NFS4ERR_BADSEQID.
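
The server-side rules (3) through (5) above can be written as a small
decision table. This is a sketch only; the function name and the two
boolean inputs are invented for illustration, and it ignores replay
handling entirely.

```python
def server_open_reply(seqid_correct: bool, active_state: bool) -> str:
    """Hypothetical summary of proposed server rules 3-5 for an OPEN."""
    if seqid_correct:
        return "NFS4_OK"               # rule 4, and the ordinary in-sequence case
    if active_state:
        return "NFS4ERR_BADSEQID"      # rule 3: client lost state it must keep
    return "OPEN4_RESULT_CONFIRM"      # rule 5: treat the owner as seen for the first time


if __name__ == "__main__":
    # Print all four combinations of the two inputs.
    for seqid_correct in (True, False):
        for active_state in (True, False):
            print(seqid_correct, active_state,
                  server_open_reply(seqid_correct, active_state))
```

The only combination where this differs from Door #1 is an incorrect
seqid with no active state.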

	-David

_______________________________________________
nfsv4 mailing list
nfsv4@ietf.org
https://www1.ietf.org/mailman/listinfo/nfsv4


From nfsv4-bounces@ietf.org Fri May 20 14:01:46 2005
Received: from localhost.localdomain ([127.0.0.1] helo=megatron.ietf.org)
	by megatron.ietf.org with esmtp (Exim 4.32)
	id 1DZBog-0004nE-FV; Fri, 20 May 2005 14:01:46 -0400
Received: from odin.ietf.org ([132.151.1.176] helo=ietf.org)
	by megatron.ietf.org with esmtp (Exim 4.32) id 1DZBoe-0004kC-KM
	for nfsv4@megatron.ietf.org; Fri, 20 May 2005 14:01:44 -0400
Received: from ietf-mx.ietf.org (ietf-mx.ietf.org [132.151.6.1])
	by ietf.org (8.9.1a/8.9.1a) with ESMTP id OAA17406
	for <nfsv4@ietf.org>; Fri, 20 May 2005 14:01:43 -0400 (EDT)
Received: from dsl093-002-214.det1.dsl.speakeasy.net ([66.93.2.214]
	helo=pickle.fieldses.org) by ietf-mx.ietf.org with esmtp (Exim 4.33)
	id 1DZC63-0001Y5-4y
	for nfsv4@ietf.org; Fri, 20 May 2005 14:19:44 -0400
Received: from bfields by pickle.fieldses.org with local (Exim 4.50)
	id 1DZBoZ-0000rg-Ru; Fri, 20 May 2005 14:01:39 -0400
Date: Fri, 20 May 2005 14:01:39 -0400
To: David Robinson <David.Robinson@Sun.COM>
Subject: Re: [nfsv4] more re: client re-using lock_owner
Message-ID: <20050520180139.GA2423@fieldses.org>
References: <C98692FD98048C41885E0B0FACD9DFB8BBBE55@exnane01.hq.netapp.com>
	<428E23A0.9090300@sun.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <428E23A0.9090300@sun.com>
User-Agent: Mutt/1.5.9i
From: "J. Bruce Fields" <bfields@fieldses.org>
X-Spam-Score: 0.0 (/)
X-Scan-Signature: de4f315c9369b71d7dd5909b42224370
Cc: rick@snowhite.cis.uoguelph.ca, "Noveck, Dave" <Dave.Noveck@netapp.com>,
	nfsv4@ietf.org
X-BeenThere: nfsv4@ietf.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: NFSv4 Working Group <nfsv4.ietf.org>
List-Unsubscribe: <https://www1.ietf.org/mailman/listinfo/nfsv4>,
	<mailto:nfsv4-request@ietf.org?subject=unsubscribe>
List-Archive: <http://www1.ietf.org/pipermail/nfsv4>
List-Post: <mailto:nfsv4@ietf.org>
List-Help: <mailto:nfsv4-request@ietf.org?subject=help>
List-Subscribe: <https://www1.ietf.org/mailman/listinfo/nfsv4>,
	<mailto:nfsv4-request@ietf.org?subject=subscribe>
Sender: nfsv4-bounces@ietf.org
Errors-To: nfsv4-bounces@ietf.org

On Fri, May 20, 2005 at 12:51:28PM -0500, David Robinson wrote:
> However, if there is no active state, how long a time period or
> what events must occur before the reuse of a lock_owner
> qualifies to be the "first time"? And how is a client expected
> to be able to reliably determine that fact?

Why would it need to?  I can't see why a client would need to reuse a
lock_owner.

> The other factor is that if the client had done something major like
> rebooted and lost any history, it may not have any way to construct a
> valid seqid so it must start with a new lock_owner.

In this case it's going to do a setclientid and blow away all its old
state anyway, isn't it?

--b.

_______________________________________________
nfsv4 mailing list
nfsv4@ietf.org
https://www1.ietf.org/mailman/listinfo/nfsv4


From nfsv4-bounces@ietf.org Fri May 20 14:31:14 2005
Received: from localhost.localdomain ([127.0.0.1] helo=megatron.ietf.org)
	by megatron.ietf.org with esmtp (Exim 4.32)
	id 1DZCHC-0004GD-Oz; Fri, 20 May 2005 14:31:14 -0400
Received: from odin.ietf.org ([132.151.1.176] helo=ietf.org)
	by megatron.ietf.org with esmtp (Exim 4.32) id 1DZCHB-0004G3-R9
	for nfsv4@megatron.ietf.org; Fri, 20 May 2005 14:31:13 -0400
Received: from ietf-mx.ietf.org (ietf-mx.ietf.org [132.151.6.1])
	by ietf.org (8.9.1a/8.9.1a) with ESMTP id OAA20854
	for <nfsv4@ietf.org>; Fri, 20 May 2005 14:31:12 -0400 (EDT)
Received: from pat.uio.no ([129.240.130.16] ident=7411)
	by ietf-mx.ietf.org with esmtp (Exim 4.33) id 1DZCYb-0002S5-Ay
	for nfsv4@ietf.org; Fri, 20 May 2005 14:49:13 -0400
Received: from mail-mx2.uio.no ([129.240.10.30])
	by pat.uio.no with esmtp (Exim 4.43)
	id 1DZCH5-0003zq-RK; Fri, 20 May 2005 20:31:07 +0200
Received: from dh138.citi.umich.edu ([141.211.133.138])
	by mail-mx2.uio.no with esmtpsa (SSLv3:RC4-MD5:128) (Exim 4.43)
	id 1DZCGz-0007bg-1w; Fri, 20 May 2005 20:31:01 +0200
Subject: Re: [nfsv4] more re: client re-using lock_owner
From: Trond Myklebust <trond.myklebust@fys.uio.no>
To: David Robinson <David.Robinson@Sun.COM>
In-Reply-To: <428E23A0.9090300@sun.com>
References: <C98692FD98048C41885E0B0FACD9DFB8BBBE55@exnane01.hq.netapp.com>
	<428E23A0.9090300@sun.com>
Content-Type: text/plain
Date: Fri, 20 May 2005 14:30:55 -0400
Message-Id: <1116613855.15684.29.camel@lade.trondhjem.org>
Mime-Version: 1.0
X-Mailer: Evolution 2.2.1.1 
Content-Transfer-Encoding: 7bit
X-UiO-Spam-info: not spam, SpamAssassin (score=-3.819, required 12,
	autolearn=disabled, AWL 1.18, UIO_MAIL_IS_INTERNAL -5.00)
X-Spam-Score: 0.0 (/)
X-Scan-Signature: b4a0a5f5992e2a4954405484e7717d8c
Content-Transfer-Encoding: 7bit
Cc: rick@snowhite.cis.uoguelph.ca, "Noveck, Dave" <Dave.Noveck@netapp.com>,
	nfsv4@ietf.org
X-BeenThere: nfsv4@ietf.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: NFSv4 Working Group <nfsv4.ietf.org>
List-Unsubscribe: <https://www1.ietf.org/mailman/listinfo/nfsv4>,
	<mailto:nfsv4-request@ietf.org?subject=unsubscribe>
List-Archive: <http://www1.ietf.org/pipermail/nfsv4>
List-Post: <mailto:nfsv4@ietf.org>
List-Help: <mailto:nfsv4-request@ietf.org?subject=help>
List-Subscribe: <https://www1.ietf.org/mailman/listinfo/nfsv4>,
	<mailto:nfsv4-request@ietf.org?subject=subscribe>
Sender: nfsv4-bounces@ietf.org
Errors-To: nfsv4-bounces@ietf.org

On Fri, 20.05.2005 at 12:51 (-0500), David Robinson wrote:

> This seems unduly harsh. Section 8.1.8 says that the server upon
> seeing a lock_owner for the "first time" will require an
> OPEN_CONFIRM. But the spec fails to talk very clearly
> about what is the "first time". It is clear that while there
> is active state for that lock_owner it can't be the first
> time and the server MUST return back NFS4ERR_BADSEQID, the
> client has screwed up state that it is required to maintain.
> 
> However, if there is no active state, how long a time period or
> what events must occur before the reuse of a lock_owner
> qualifies to be the "first time"? And how is a client expected
> to be able to reliably determine that fact?  The other factor is
> that if the client had done something major like rebooted and
> lost any history, it may not have any way to construct a valid
> seqid so it must start with a new lock_owner. Seemingly harmless
> to do such a thing, but I can imagine scenarios where a client
> would rather not keep trying new values until it hit one that
> worked. Eventually the server will forget that lock_owner and
> will think it is the "first time", so why not make that a
> more pro-active event?

Errm.. If the client has rebooted, then the server is expected to clear
all state as soon as the SETCLIENTID negotiation is done.

As for the "first time" issue: our client, at least, is coded so that we
don't reuse the same open owner strings. Encoding a uniquifier is hardly
an unsolvable problem, and it also ensures that we don't end up
confusing the replay caches on the server.

> So I would like Door #2 Monty...
> 
> Reasonable behavior should be:
> 
> 1) Client MUST use the correct next seqid if there is active state
>     for that lock_owner.
> 2) Client SHOULD use the correct next seqid if there is no
>     active state for that lock_owner.
> 3) The server MUST return NFS4ERR_BADSEQID if the OPEN lock_owner
>     has an incorrect seqid and there is active state for that
>     lock_owner.
> 4) The server SHOULD allow the OPEN if the lock_owner has the
>     correct seqid and there is no active state for that lock_owner.
> 5) The server SHOULD (MUST?) return OPEN4_RESULT_CONFIRM if the
>     lock_owner has an incorrect seqid and there is no active state
>     for that lock_owner.

I assume you too mean "open_owner" and not lock_owner?

So assume 2 OPEN requests come in for the same open_owner. One turns out
to be a replay of an old OPEN from a previous generation of the same
open owner, and the other one is an actual new request: how should the
server deal with it assuming rule (5)?

AFAICS, this would appear to be basically the same problem that TCP has
on connections to the same port. Once you close the connection, then you
need a suitable moratorium period in order to allow replays to die out
before you allow a new connection. The length of that moratorium,
however, is more of a "best practices" issue rather than a protocol
issue since it depends on the nature of the transport used (10GigE,
10Mbit, multipathing, really old and stupid routers, etc.).
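
One way to read the TCP analogy is as a quiet period on the open owner.
The sketch below is purely illustrative: the function name, the
idle-time input, and the moratorium parameter are assumptions, and no
particular moratorium length is suggested by the spec or by this thread.

```python
def reply_to_unexpected_seqid(active_opens: int,
                              idle_seconds: float,
                              moratorium_seconds: float) -> str:
    """Hypothetical policy for an OPEN whose seqid does not match the owner.

    idle_seconds is how long the owner has had no opens; moratorium_seconds
    is a site-chosen quiet period meant to let old replays die out.
    """
    if active_opens > 0:
        # The owner is live, so a wrong seqid is simply a client error.
        return "NFS4ERR_BADSEQID"
    if idle_seconds < moratorium_seconds:
        # A replay from the previous generation could still be in flight.
        return "NFS4ERR_BADSEQID"
    # Quiet for long enough: start a new sequence and require confirmation.
    return "OPEN4_RESULT_CONFIRM"
```

Choosing moratorium_seconds is exactly the transport-dependent "best
practices" question, which is why it is left as a parameter here.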

Cheers,
  Trond


_______________________________________________
nfsv4 mailing list
nfsv4@ietf.org
https://www1.ietf.org/mailman/listinfo/nfsv4


From nfsv4-bounces@ietf.org Fri May 20 14:40:31 2005
Received: from localhost.localdomain ([127.0.0.1] helo=megatron.ietf.org)
	by megatron.ietf.org with esmtp (Exim 4.32)
	id 1DZCQA-0008To-UL; Fri, 20 May 2005 14:40:30 -0400
Received: from odin.ietf.org ([132.151.1.176] helo=ietf.org)
	by megatron.ietf.org with esmtp (Exim 4.32) id 1DZCQA-0008Te-59
	for nfsv4@megatron.ietf.org; Fri, 20 May 2005 14:40:30 -0400
Received: from ietf-mx.ietf.org (ietf-mx.ietf.org [132.151.6.1])
	by ietf.org (8.9.1a/8.9.1a) with ESMTP id OAA22555
	for <nfsv4@ietf.org>; Fri, 20 May 2005 14:40:28 -0400 (EDT)
From: email2mre-ietf@yahoo.com
Received: from web30312.mail.mud.yahoo.com ([68.142.201.230])
	by ietf-mx.ietf.org with smtp (Exim 4.33) id 1DZChX-0002u9-PI
	for nfsv4@ietf.org; Fri, 20 May 2005 14:58:30 -0400
Received: (qmail 83032 invoked by uid 60001); 20 May 2005 18:40:16 -0000
Comment: DomainKeys? See http://antispam.yahoo.com/domainkeys
DomainKey-Signature: a=rsa-sha1; q=dns; c=nofws; s=s1024; d=yahoo.com;
	b=oqG7XkBl7ZbMoJbleUYSVDErBNrOl0u1/nP2N0UDOFrGg0W6RXNUx7YhmH9jwThY+go6uq1izlabr0ofdSgHdrNtUo5iv35WnlVYsjDlT3t56Esip3kBHj4BOWfFm7pY9ufE8ylV0G7Qp9VtaeWosPkd2oGa16NWaR6H/NRBciM=
	; 
Message-ID: <20050520184016.83030.qmail@web30312.mail.mud.yahoo.com>
Received: from [198.95.226.224] by web30312.mail.mud.yahoo.com via HTTP;
	Fri, 20 May 2005 11:40:16 PDT
Date: Fri, 20 May 2005 11:40:16 -0700 (PDT)
Subject: RE: [nfsv4] more re: client re-using lock_owner
To: nfsv4@ietf.org
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
X-Spam-Score: 0.8 (/)
X-Scan-Signature: 92df29fa99cf13e554b84c8374345c17
X-BeenThere: nfsv4@ietf.org
X-Mailman-Version: 2.1.5
Precedence: list
Reply-To: email2mre-ietf@yahoo.com
List-Id: NFSv4 Working Group <nfsv4.ietf.org>
List-Unsubscribe: <https://www1.ietf.org/mailman/listinfo/nfsv4>,
	<mailto:nfsv4-request@ietf.org?subject=unsubscribe>
List-Archive: <http://www1.ietf.org/pipermail/nfsv4>
List-Post: <mailto:nfsv4@ietf.org>
List-Help: <mailto:nfsv4-request@ietf.org?subject=help>
List-Subscribe: <https://www1.ietf.org/mailman/listinfo/nfsv4>,
	<mailto:nfsv4-request@ietf.org?subject=subscribe>
Sender: nfsv4-bounces@ietf.org
Errors-To: nfsv4-bounces@ietf.org

> From: David Robinson [mailto:David.Robinson@Sun.COM] 

> Noveck, Dave wrote:

> > However, even though it is not illegal, it still strikes me as
> > a very bad thing to do.  You have enough information to figure
> > out that the seqid is wrong so that it just seems better to tell 
> > the client that, even though you might not be able to if you had
> > dropped that info.  With door #2 you have a server in which 
> > getting a bad owner seqid when you have no opens brings on an 
> > attack of amnesia, so you always respond as if you had dropped
> > the owner info.   
> 
> This seems unduly harsh. Section 8.1.8 says that the server upon
> seeing a lock_owner for the "first time" will require an
> OPEN_CONFIRM. But the spec fails to talk very clearly
> about what is the "first time". It is clear that while there
> is active state for that lock_owner it can't be the first
> time and the server MUST return back NFS4ERR_BADSEQID, the
> client has screwed up state that it is required to maintain.
> 
> However, if there is no active state, how long a time period or
> what events must occur before the reuse of a lock_owner
> qualifies to be the "first time"? And how is a client expected
> to be able to reliably determine that fact?  The other factor is
> that if the client had done something major like rebooted and
> lost any history, it may not have any way to construct a valid
> seqid so it must start with a new lock_owner. Seemingly harmless
> to do such a thing, but I can imagine scenarios where a client
> would rather not keep trying new values until it hit one that
> worked. Eventually the server will forget that lock_owner and
> will think it is the "first time", so why not make that a
> more pro-active event?

I don't see how this particular situation applies, since the
rebooted client will have issued a SETCLIENTID sequence and thus caused the
server to destroy the open_owner, lock_owner, etc. state associated
with it.

We have a RELEASE_LOCKOWNER operation, but no RELEASE_OPENOWNER
operation. And despite the subject of this thread, the issue is open_owners,
not lock_owners.

> 
> So I would like Door #2 Monty...

Here's what bothers me about #2. Let's say the reason for
the re-used open_owner is not that the client has
forgotten, but that an OPEN is being retried
while the client still remembers the open_owner state.
If we return NFS4ERR_BADSEQID,
then this doesn't perturb the existing sequence number state
for the open_owner. If we request open confirmation, then
the server has two choices:

1. Perturb the state. So if the client in fact had not forgotten
   about the open_owner's previous sequence number, the next time
   the client goes to use a sequence number from the previous use of
   the open_owner, he gets NFS4ERR_BADSEQID. And gets very
   confused.

2. Provisionally perturb the state to address the issue of (1)
   above. The server has to maintain two parallel states
   (similar to the Schrodinger's Cat experiment), one
   of which collapses based on a subsequent use.
   The original sequence number state collapses
   when the client follows up with the OPEN_CONFIRM, to be replaced with
   the new sequence. Or the provisional new sequence number state collapses,
   with the original state continued, once the original sequence number
   is used for another state-changing operation.
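
Choice 2 above can be sketched roughly as follows. This is only an
illustration under stated assumptions: the class name, method names, and
return strings are hypothetical, and it assumes an OPEN_CONFIRM carries
the seqid that follows the provisional OPEN. It is not taken from any
server implementation.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class OpenOwnerState:
    """Hypothetical per-open_owner record (names are illustrative only)."""
    seqid: int                         # last seqid accepted in the original sequence
    provisional: Optional[int] = None  # seqid of a re-used, not yet confirmed OPEN

    def open_with_unexpected_seqid(self, seqid: int) -> str:
        # Keep the original sequence and remember the new one provisionally,
        # rather than overwriting anything.
        self.provisional = seqid
        return "OK, confirm required"

    def open_confirm(self, seqid: int) -> str:
        # The client followed up with OPEN_CONFIRM: the original sequence
        # collapses and the new sequence takes its place.
        if self.provisional is not None and seqid == self.provisional + 1:
            self.seqid = seqid
            self.provisional = None
            return "OK"
        return "NFS4ERR_BADSEQID"

    def state_changing_op(self, seqid: int) -> str:
        # The client continued the original sequence: the provisional
        # sequence collapses and the original one carries on.
        if seqid == self.seqid + 1:
            self.seqid = seqid
            self.provisional = None
            return "OK"
        return "NFS4ERR_BADSEQID"
```

The cost being pointed out is visible in the record itself: the server
has to carry two sequence numbers per owner until one of them is used.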


_______________________________________________
nfsv4 mailing list
nfsv4@ietf.org
https://www1.ietf.org/mailman/listinfo/nfsv4


From nfsv4-bounces@ietf.org Fri May 20 14:58:48 2005
Received: from localhost.localdomain ([127.0.0.1] helo=megatron.ietf.org)
	by megatron.ietf.org with esmtp (Exim 4.32)
	id 1DZChs-0006aE-SO; Fri, 20 May 2005 14:58:48 -0400
Received: from odin.ietf.org ([132.151.1.176] helo=ietf.org)
	by megatron.ietf.org with esmtp (Exim 4.32) id 1DZChq-0006a9-9J
	for nfsv4@megatron.ietf.org; Fri, 20 May 2005 14:58:47 -0400
Received: from ietf-mx.ietf.org (ietf-mx.ietf.org [132.151.6.1])
	by ietf.org (8.9.1a/8.9.1a) with ESMTP id OAA25449
	for <nfsv4@ietf.org>; Fri, 20 May 2005 14:58:44 -0400 (EDT)
From: rick@snowhite.cis.uoguelph.ca
Received: from moe.cs.uoguelph.ca ([131.104.96.55])
	by ietf-mx.ietf.org with esmtp (Exim 4.33) id 1DZCzF-0003hP-95
	for nfsv4@ietf.org; Fri, 20 May 2005 15:16:46 -0400
Received: from snowhite.cis.uoguelph.ca (snowhite.cis.uoguelph.ca
	[131.104.48.1])
	by moe.cs.uoguelph.ca (8.13.1/8.13.1) with ESMTP id j4KIwhng005425
	for <nfsv4@ietf.org>; Fri, 20 May 2005 14:58:43 -0400
Received: (from rick@localhost)
	by snowhite.cis.uoguelph.ca (8.9.3/8.9.3) id PAA30604
	for nfsv4@ietf.org; Fri, 20 May 2005 15:00:14 -0400 (EDT)
Date: Fri, 20 May 2005 15:00:14 -0400 (EDT)
Message-Id: <200505201900.PAA30604@snowhite.cis.uoguelph.ca>
To: nfsv4@ietf.org
X-Scanned-By: MIMEDefang 2.44
X-Spam-Score: 0.3 (/)
X-Scan-Signature: 30ac594df0e66ffa5a93eb4c48bcb014
Subject: [nfsv4] more re: lock_owner re-use
X-BeenThere: nfsv4@ietf.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: NFSv4 Working Group <nfsv4.ietf.org>
List-Unsubscribe: <https://www1.ietf.org/mailman/listinfo/nfsv4>,
	<mailto:nfsv4-request@ietf.org?subject=unsubscribe>
List-Archive: <http://www1.ietf.org/pipermail/nfsv4>
List-Post: <mailto:nfsv4@ietf.org>
List-Help: <mailto:nfsv4-request@ietf.org?subject=help>
List-Subscribe: <https://www1.ietf.org/mailman/listinfo/nfsv4>,
	<mailto:nfsv4-request@ietf.org?subject=subscribe>
Sender: nfsv4-bounces@ietf.org
Errors-To: nfsv4-bounces@ietf.org

> operation. And despite the subject of this thread, the issue is open_owners,
> not lock_owners.

Yes, I meant "open_owner". In my defence, I will note that the term lock_owner
is used throughout Sec. 8, up until 8.1.8, where open_owner makes its first
appearance.

Glad to see I stirred up a little controversy. Looks like Door #1 is the
winner so far, rick



From nfsv4-bounces@ietf.org Fri May 20 19:24:17 2005
Received: from localhost.localdomain ([127.0.0.1] helo=megatron.ietf.org)
	by megatron.ietf.org with esmtp (Exim 4.32)
	id 1DZGqn-0002m9-Co; Fri, 20 May 2005 19:24:17 -0400
Received: from odin.ietf.org ([132.151.1.176] helo=ietf.org)
	by megatron.ietf.org with esmtp (Exim 4.32) id 1DZGqh-0002kL-TJ
	for nfsv4@megatron.ietf.org; Fri, 20 May 2005 19:24:13 -0400
Received: from ietf-mx.ietf.org (ietf-mx.ietf.org [132.151.6.1])
	by ietf.org (8.9.1a/8.9.1a) with ESMTP id TAA26450
	for <nfsv4@ietf.org>; Fri, 20 May 2005 19:24:08 -0400 (EDT)
Received: from mx1.netapp.com ([216.240.18.38])
	by ietf-mx.ietf.org with esmtp (Exim 4.33) id 1DZH89-0004IW-SB
	for nfsv4@ietf.org; Fri, 20 May 2005 19:42:14 -0400
Received: from smtp1.corp.netapp.com (10.57.156.124)
	by mx1.netapp.com with ESMTP; 20 May 2005 16:24:02 -0700
X-IronPort-AV: i="3.93,125,1115017200"; 
	d="scan'208"; a="172285447:sNHT19339948"
Received: from svlexc03.hq.netapp.com (svlexc03.corp.netapp.com
	[10.57.156.149])
	by smtp1.corp.netapp.com (8.13.1/8.13.1/NTAP-1.6) with ESMTP id
	j4KNLC4h001872; Fri, 20 May 2005 16:24:01 -0700 (PDT)
Received: from burgundy.hq.netapp.com ([10.56.10.66]) by
	svlexc03.hq.netapp.com with Microsoft SMTPSVC(6.0.3790.0); 
	Fri, 20 May 2005 16:23:12 -0700
Received: from exnane01.hq.netapp.com ([10.97.0.61]) by burgundy.hq.netapp.com
	with Microsoft SMTPSVC(5.0.2195.6713); 
	Fri, 20 May 2005 16:23:11 -0700
X-MimeOLE: Produced By Microsoft Exchange V6.5.7226.0
Content-class: urn:content-classes:message
MIME-Version: 1.0
Content-Type: text/plain;
	charset="us-ascii"
Content-Transfer-Encoding: quoted-printable
Subject: RE: [nfsv4] more re: client re-using lock_owner
Date: Fri, 20 May 2005 19:23:10 -0400
Message-ID: <C98692FD98048C41885E0B0FACD9DFB8BBBE5E@exnane01.hq.netapp.com>
Thread-Topic: [nfsv4] more re: client re-using lock_owner
Thread-Index: AcVdZI+ccTI7hc/kRMmZW64hpgqy2AACC4bw
From: "Noveck, Dave" <Dave.Noveck@netapp.com>
To: "David Robinson" <David.Robinson@sun.com>
X-OriginalArrivalTime: 20 May 2005 23:23:11.0883 (UTC)
	FILETIME=[E3D395B0:01C55D92]
X-Spam-Score: 0.0 (/)
X-Scan-Signature: 225414c974e0d6437992164e91287a51
Content-Transfer-Encoding: quoted-printable
Cc: nfsv4@ietf.org, rick@snowhite.cis.uoguelph.ca
X-BeenThere: nfsv4@ietf.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: NFSv4 Working Group <nfsv4.ietf.org>
List-Unsubscribe: <https://www1.ietf.org/mailman/listinfo/nfsv4>,
	<mailto:nfsv4-request@ietf.org?subject=unsubscribe>
List-Archive: <http://www1.ietf.org/pipermail/nfsv4>
List-Post: <mailto:nfsv4@ietf.org>
List-Help: <mailto:nfsv4-request@ietf.org?subject=help>
List-Subscribe: <https://www1.ietf.org/mailman/listinfo/nfsv4>,
	<mailto:nfsv4-request@ietf.org?subject=subscribe>
Sender: nfsv4-bounces@ietf.org
Errors-To: nfsv4-bounces@ietf.org

David Robinson wrote:
> Noveck, Dave wrote:
> > OK, I see.  Since the spec allows the server to forget owner state
> > when there are no associated opens, it follows that returning OK
> > and requiring open-confirm cannot be illegal, since a server which
> > has dropped that open state will do precisely that.
> >
> > However, even though it is not illegal, it still strikes me as
> > a very bad thing to do.  You have enough information to figure
> > out that the seqid is wrong so that it just seems better to tell
> > the client that, even though you might not be able to if you had
> > dropped that info.  With door #2 you have a server in which
> > getting a bad owner seqid when you have no opens brings on an
> > attack of amnesia, so you always respond as if you had dropped
> > the owner info.
>
> This seems unduly harsh. Section 8.1.8 says that the server upon
> seeing a lock_owner for the "first time" will require an
> OPEN_CONFIRM. But the spec fails to talk very clearly
> about what is the "first time". It is clear that while there
> is active state for that lock_owner it can't be the first
> time and the server MUST return back NFS4ERR_BADSEQID, the
> client has screwed up state that it is required to maintain.

> However, if there is no active state, how long a time period or
> what events must occur before the reuse of a lock_owner
> qualifies to be the "first time"?

The spec gives no constraints on this.  A server may drop the
owner state immediately after the last close of an open for this
owner if it wants.  The only consequence of doing that would be
a lot of extra OPEN_CONFIRMs and thus bad performance.  It would
be legal, however.

In the case we are talking about, where the hypothesis is that
the server does indeed have information about the openowner, it
is clear this is not the "first time" the server is seeing that
owner and the server knows that.  The only reason that this is
not illegal is that judgments of spec legality are subject to
an epistemological constraint.  They may not refer to details
of the internals of the server, even though we may very well
know about them.  If something is illegal it must be determinable
from behavior visible over the wire.  But note that I am not
saying this is illegal.  I'm just saying it seems bad to me
and I think I may take into account what we know about the
actual situation when we discuss those questions.


> And how is a client expected
> to be able to reliably determine that fact?

He doesn't need to.  He is free to drop owners at any time as long
as he doesn't reuse the owner string after dropping that owner,
and that isn't very hard to arrange.  If he is referring to a new
owner, and it would necessarily be a new owner if there is a new
sequence space, then the new owner must be associated with a new
opaque string.  You can't use the same opaque string for two
different owners.

> The other factor is
> that if the client had done something major like rebooted and
> lost any history, it may not have any way to construct a valid
> seqid so it must start with a new lock_owner.

What does it use for a clientid?  If it does a SETCLIENTID, the
owner state is gone and he has a fresh slate.  If he wants to
use the same clientid, then he can write it to disk, and pick it
up after the reboot.  But if he wants to do that, he'd better also
write the owner state persistently as well.  If he can't do that,
he has to start fresh with a SETCLIENTID.

> Seemingly harmless
> to do such a thing, but I can imagine scenarios where a client
> would rather not keep trying new values until it hit one that
> worked. Eventually the server will forget that lock_owner and
> will think it is the "first time", so why not make that a
> more pro-active event?

It is easy to create a new lockowner all the time.  Just append the
time you created it and it will be unique.
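
A minimal sketch of that idea, assuming the owner string is built on
the client from some base identifier plus the creation time. The
function name, the ':' separator, and the extra in-process counter
(added here only to cover two owners created within the same clock
tick) are assumptions for illustration, not something the spec or any
client prescribes.

```python
import itertools
import time

# Hypothetical tie-breaker for owners created within one clock tick.
_counter = itertools.count()


def new_open_owner(base: bytes) -> bytes:
    """Return an owner string this client process has not produced before."""
    created_ns = time.time_ns()  # creation time, as suggested above
    return b"%s:%d:%d" % (base, created_ns, next(_counter))
```

Because every call yields a distinct string, the client never has to
remember the seqid of an owner it has dropped.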

The point here is that the client is not helpless.  The server will
only know about owners that the client himself created.  While there
is not an infinite set of possible owner strings, there are eight to
the two to the thirty-second, which is pretty close to infinity for
practical purposes.

> So I would like Door #2 Monty...

It is legal.  You may go through the door.  If I say overly harsh
things about your choice, don't take it personally.  I just don't
think it is the right choice, even though I agree it is legal.

> Reasonable behavior should be:

We need to separate what someone thinks is reasonable from what
is in RFC3530.  If you think the spec should have allowed clients
to forget state at will, then that is one thing.  But I don't think
the spec actually did that.

> 1) Client MUST use the correct next seqid if there is active state
>    for that lock_owner.
> 2) Client SHOULD use the correct next seqid if there is no
>    active state for that lock_owner.

Strictly speaking, the client may send any seqid he wants but if it
is not the correct one, he has to be prepared for BADSEQID.  If there
is open state and he uses an incorrect seqid, then he may rely on
getting BADSEQID.  If there is no open, he may not rely on getting that
error, since the server may have forgotten the owner state.

> 3) The server MUST return NFS4ERR_BADSEQID if the OPEN lock_owner
>    has an incorrect seqid and there is active state for that
>    lock_owner.

Agree.

> 4) The server SHOULD allow the OPEN if the lock_owner has the
>    correct seqid and there is no active state for that lock_owner.

He must process the open normally.  Whether it is allowed depends on
lots of stuff.

> 5) The server SHOULD (MUST?) return OPEN4_RESULT_CONFIRM if the
>    lock_owner has an incorrect seqid and there is no active state
>    for that lock_owner.

Why?  If you assert that this is somehow required by RFC3530, then
where exactly does it say that?  I think it says that if it knows
a seqid is invalid it should return BADSEQID.  I don't think the fact
that it might sometimes not know that changes that.  It may be
that 5) could have been added to the spec without harm, but it
wasn't.

As far as the (MUST?) alternative, how can you say you "SHOULD"
do the open if he has the correct seqid and "MUST" do it if he
has an incorrect seqid?
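
To make the disagreement concrete, here is a rough sketch of how a
server might pick its OPEN response under either door. The record
layout, function name, and result strings are hypothetical, and replay
handling (seqid equal to the last one seen) is deliberately left out.
The two doors differ only in the final branch.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class Owner:
    last_seqid: int  # last seqid the server accepted for this owner
    open_count: int  # opens currently held, i.e. "active state"


def open_result(owners: Dict[bytes, Owner], name: bytes,
                seqid: int, door: int) -> str:
    """Pick the response to an OPEN; door is 1 or 2 as in this thread."""
    owner = owners.get(name)
    if owner is None:
        # Never seen, or already forgotten: this is the "first time".
        return "OK+CONFIRM"
    if seqid == owner.last_seqid + 1:
        # Correct next seqid: process the open normally.
        return "OK"
    if owner.open_count > 0:
        # Wrong seqid with active state: everyone agrees on this one.
        return "NFS4ERR_BADSEQID"
    # Remembered owner, no opens, wrong seqid: the doors part ways here.
    return "NFS4ERR_BADSEQID" if door == 1 else "OK+CONFIRM"
```

Under door #2 the last branch makes a remembered owner indistinguishable,
on the wire, from one the server has dropped.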

> So in normal processing everyone SHOULD make a reasonable effort
> to maintain enough information to minimize the need to use
> OPEN_CONFIRM. But in the absence of active state, if one side
> forgets, it is useful to allow the lock_owner to be reused
> by sending OPEN4_RESULT_CONFIRM instead of NFS4ERR_BADSEQID.

I don't see how it is useful.  The client should not reuse the
same opaque string for different owners and if he forgets the
seqid, then these are different owners.

You may think so but the fact is that the spec discusses the
case of the server forgetting the stateid and never mentions
the case of the client forgetting it.  If you think that that is
an oversight, then you need an argument to show why this is a
necessary case to deal with.  I assert that it is not, since the
cases of the server and the client are not parallel.  The client
decides on the lifetime of owners and the server has no way of
knowing when the client is done with one.  This requires special
handling to deal with the case of a server who is forced to drop
state.  The client doesn't have that problem.  He may drop owners
at will with no special protocol provision, as long as he doesn't
try to forget the seqid and then use that same owner string again.



From nfsv4-bounces@ietf.org Fri May 20 19:52:18 2005
Received: from localhost.localdomain ([127.0.0.1] helo=megatron.ietf.org)
	by megatron.ietf.org with esmtp (Exim 4.32)
	id 1DZHHu-000107-Ed; Fri, 20 May 2005 19:52:18 -0400
Received: from odin.ietf.org ([132.151.1.176] helo=ietf.org)
	by megatron.ietf.org with esmtp (Exim 4.32) id 1DZHHs-000101-Fr
	for nfsv4@megatron.ietf.org; Fri, 20 May 2005 19:52:16 -0400
Received: from ietf-mx.ietf.org (ietf-mx.ietf.org [132.151.6.1])
	by ietf.org (8.9.1a/8.9.1a) with ESMTP id TAA28063
	for <nfsv4@ietf.org>; Fri, 20 May 2005 19:52:13 -0400 (EDT)
Received: from brmea-mail-4.sun.com ([192.18.98.36])
	by ietf-mx.ietf.org with esmtp (Exim 4.33) id 1DZHZK-0004r0-Sk
	for nfsv4@ietf.org; Fri, 20 May 2005 20:10:19 -0400
Received: from phys-aus08-1 ([129.153.131.88])
	by brmea-mail-4.sun.com (8.12.10/8.12.9) with ESMTP id j4KNqE0R009058
	for <nfsv4@ietf.org>; Fri, 20 May 2005 17:52:15 -0600 (MDT)
Received: from conversion-daemon.aus08-mail1.central.sun.com by
	aus08-mail1.central.sun.com
	(iPlanet Messaging Server 5.2 HotFix 1.24 (built Dec 19 2003))
	id <0IGT00M01AE7XK@aus08-mail1.central.sun.com>
	(original mail from David.Robinson@Sun.COM) for nfsv4@ietf.org; Fri,
	20 May 2005 18:52:14 -0500 (CDT)
Received: from [129.153.128.60] (jetsun.Central.Sun.COM [129.153.128.60])
	by aus08-mail1.central.sun.com
	(iPlanet Messaging Server 5.2 HotFix 1.24 (built Dec 19 2003))
	with ESMTP id <0IGT0048VBN22L@aus08-mail1.central.sun.com>; Fri,
	20 May 2005 18:52:14 -0500 (CDT)
Date: Fri, 20 May 2005 18:52:14 -0500
From: David Robinson <David.Robinson@Sun.COM>
Subject: Re: [nfsv4] more re: client re-using lock_owner
In-reply-to: <20050520180139.GA2423@fieldses.org>
To: nfsv4@ietf.org
Message-id: <428E782E.40304@sun.com>
MIME-version: 1.0
Content-type: text/plain; charset=ISO-8859-1; format=flowed
Content-transfer-encoding: 7BIT
X-Accept-Language: en-us, en
User-Agent: Mozilla Thunderbird 1.0 (X11/20050129)
References: <C98692FD98048C41885E0B0FACD9DFB8BBBE55@exnane01.hq.netapp.com>
	<428E23A0.9090300@sun.com> <20050520180139.GA2423@fieldses.org>
X-Spam-Score: 0.0 (/)
X-Scan-Signature: 5ebbf074524e58e662bc8209a6235027
Content-Transfer-Encoding: 7BIT
Cc: rick@snowhite.cis.uoguelph.ca
X-BeenThere: nfsv4@ietf.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: NFSv4 Working Group <nfsv4.ietf.org>
List-Unsubscribe: <https://www1.ietf.org/mailman/listinfo/nfsv4>,
	<mailto:nfsv4-request@ietf.org?subject=unsubscribe>
List-Archive: <http://www1.ietf.org/pipermail/nfsv4>
List-Post: <mailto:nfsv4@ietf.org>
List-Help: <mailto:nfsv4-request@ietf.org?subject=help>
List-Subscribe: <https://www1.ietf.org/mailman/listinfo/nfsv4>,
	<mailto:nfsv4-request@ietf.org?subject=subscribe>
Sender: nfsv4-bounces@ietf.org
Errors-To: nfsv4-bounces@ietf.org

J. Bruce Fields wrote:

>>However, if there is no active state, how long a time period or
>>what events must occur before the reuse of a lock_owner
>>qualifies to be the "first time"? And how is a client expected
>>to be able to reliably determine that fact?

> Why would it need to?  I can't see why a client would need to reuse a
> lock_owner.

There is no need to, but I don't see any reason to force them to
when the open_owner is not related to any active state.

>>The other factor is that if the client had done something major like
>>rebooted and lost any history, it may not have any way to construct a
>>valid seqid so it must start with a new lock_owner.

> In this case it's going to do a setclientid and blow away all its old
> state anyway, isn't it?

As others have pointed out, bad example, I withdraw it. But I think
the points still hold.

Trond Myklebust wrote:

 > Errm.. If the client has rebooted, then the server is expected to clear
 > all state as soon as the SETCLIENTID negotiation is done.

yup....

:g/lock_owner/open_owner/

 > As for the "first time" issue: our client, at least, is coded so that we
 > don't reuse the same open owner strings. Encoding a uniquifier is hardly
 > an unsolvable problem, and it also ensures that we don't end up
 > confusing the replay caches on the server.

Yes, it is not hard to create new open_owners, but is it required?
I don't see how this will affect reply caches as those should be
based on XID and IP addresses.

 > So assume 2 OPEN requests come in for the same open_owner. One turns out
 > to be a replay of an old OPEN from a previous generation of the same
 > open owner, and the other one is an actual new request: how should the
 > server deal with it assuming rule (5)?

If there is a replay, then it is presumed that the first OPEN
succeeded and there is active state which means rule #5
doesn't come into play.  It is rule #3 and NFS4ERR_BADSEQID as
some servers are already doing today. If the initial request
never actually made it to the server, then there is no active
state and client never saw a reply, so everything just
works as if it were the first request.

 > AFAICS, this would appear to be basically the same problem that TCP has
 > on connections to the same port. Once you close the connection, then you
 > need a suitable moratorium period in order to allow replays to die out
 > before you allow a new connection. The length of that moratorium,
 > however, is more of a "best practices" issue rather than a protocol
 > issue since it depends on the nature of the transport used (10GigE,
 > 10Mbit, multipathing, really old and stupid routers, etc.).

But in TCP we have only the sequence number to defend against replays,
in NFS we have the RPC level XID and the open_owner seqid space
to defend against wayward packets (also it would have to be on a new
TCP connection).

I assume this is Mike's obfuscated reply address.

email2mre-ietf@yahoo.com wrote:
 > Here's what bothers me about #2. Let's say reason for
 > the re-used open_owner is not because the client has
 > forgotten, but because of a retry of an OPEN,
 > but the client still remembers the open_owner state.

If the previous OPEN succeeded the server will have state
so the replay must cause NFS4ERR_BADSEQID and never an
open confirm.  So this will act just as you expect.

 > If we return NFS4ERR_BADSEQID,
 > then this doesn't perturb the existing sequence number state
 > for the open_owner. If we request open confirmation, then
 > the server has two choices:

 > 1. Perturb the state. So if the client in fact had not forgotten
 >    about the open_owner's previous sequence number, the next time
 >    client goes to use a sequence number from the previous use of
 >    the open_owner, he gets NFS4ERR_BADSEQID. And gets very
 >    confused.

This would only happen if we have a buggy client. If the server
requests an open_confirm he is telling the client that they
are establishing new state. If a client holds onto
its old state after getting an OPEN4_RESULT_CONFIRM it is
just broken.

 > 2. Provisionally perturb the state to address the issue of (1)
 >    above. The server has to maintain two parallel states
 >    (similar to Schrodinger's Cat experiment), one
 >    of which collapses based on a subsequent use.

This shouldn't be needed. The server either has active state to which
it must reply NFS4ERR_BADSEQID if it is incorrect, or it
has no active state to which it requests an open_confirm if
it is not the next seqid.

When the server has no active state, the visible response to the
client is no different than if the OPEN was the first one seen
by the server and the open_owner can be safely reused. The only
difference is if the client uses the next seqid for the open_owner
the open_confirm is unneeded, but this is what we have already.

	-David



From nfsv4-bounces@ietf.org Sat May 21 07:22:23 2005
Received: from localhost.localdomain ([127.0.0.1] helo=megatron.ietf.org)
	by megatron.ietf.org with esmtp (Exim 4.32)
	id 1DZS3j-0003HY-MK; Sat, 21 May 2005 07:22:23 -0400
Received: from odin.ietf.org ([132.151.1.176] helo=ietf.org)
	by megatron.ietf.org with esmtp (Exim 4.32) id 1DZS3i-0003GP-Ib
	for nfsv4@megatron.ietf.org; Sat, 21 May 2005 07:22:22 -0400
Received: from ietf-mx.ietf.org (ietf-mx.ietf.org [132.151.6.1])
	by ietf.org (8.9.1a/8.9.1a) with ESMTP id HAA01415
	for <nfsv4@ietf.org>; Sat, 21 May 2005 07:22:17 -0400 (EDT)
Received: from pat.uio.no ([129.240.130.16] ident=7411)
	by ietf-mx.ietf.org with esmtp (Exim 4.33) id 1DZKr7-0004OO-TP
	for nfsv4@ietf.org; Fri, 20 May 2005 23:40:55 -0400
Received: from mail-mx6.uio.no ([129.240.10.47])
	by pat.uio.no with esmtp (Exim 4.43)
	id 1DZKZJ-0000wB-6m; Sat, 21 May 2005 05:22:29 +0200
Received: from dh138.citi.umich.edu ([141.211.133.138])
	by mail-mx6.uio.no with esmtpsa (SSLv3:RC4-MD5:128) (Exim 4.43)
	id 1DZKZG-0001TA-Uz; Sat, 21 May 2005 05:22:27 +0200
Subject: Re: [nfsv4] more re: client re-using lock_owner
From: Trond Myklebust <trond.myklebust@fys.uio.no>
To: David Robinson <David.Robinson@Sun.COM>
In-Reply-To: <428E782E.40304@sun.com>
References: <C98692FD98048C41885E0B0FACD9DFB8BBBE55@exnane01.hq.netapp.com>
	<428E23A0.9090300@sun.com> <20050520180139.GA2423@fieldses.org>
	<428E782E.40304@sun.com>
Content-Type: text/plain
Date: Fri, 20 May 2005 23:22:20 -0400
Message-Id: <1116645740.15684.84.camel@lade.trondhjem.org>
Mime-Version: 1.0
X-Mailer: Evolution 2.2.1.1 
Content-Transfer-Encoding: 7bit
X-UiO-Spam-info: not spam, SpamAssassin (score=-3.976, required 12,
	autolearn=disabled, AWL 1.02, UIO_MAIL_IS_INTERNAL -5.00)
X-Spam-Score: 0.0 (/)
X-Scan-Signature: cd26b070c2577ac175cd3a6d878c6248
Content-Transfer-Encoding: 7bit
Cc: nfsv4@ietf.org, rick@snowhite.cis.uoguelph.ca
X-BeenThere: nfsv4@ietf.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: NFSv4 Working Group <nfsv4.ietf.org>
List-Unsubscribe: <https://www1.ietf.org/mailman/listinfo/nfsv4>,
	<mailto:nfsv4-request@ietf.org?subject=unsubscribe>
List-Archive: <http://www1.ietf.org/pipermail/nfsv4>
List-Post: <mailto:nfsv4@ietf.org>
List-Help: <mailto:nfsv4-request@ietf.org?subject=help>
List-Subscribe: <https://www1.ietf.org/mailman/listinfo/nfsv4>,
	<mailto:nfsv4-request@ietf.org?subject=subscribe>
Sender: nfsv4-bounces@ietf.org
Errors-To: nfsv4-bounces@ietf.org

On Fri, 20.05.2005 at 18:52 (-0500), David Robinson wrote:
> Yes, it is not hard to create new open_owners, but is it required?
> I don't see how this will affect reply caches as those should be
> based on XID and IP addresses.

NFSv4 does not rely on RPC to provide only-once semantics. See sections
8.1.5 and 8.1.6.

Note: this is one of the areas where RFC3530 also confuses lockowners
and open_owners. Those sections discuss the behaviour for CLOSE, LOCK,
LOCKU, OPEN, OPEN_CONFIRM, and OPEN_DOWNGRADE, so they are talking about
both - it is a historical accident that RFC3010 only talked about
lockowners.

>  > So assume 2 OPEN requests come in for the same open_owner. One turns out
>  > to be a replay of an old OPEN from a previous generation of the same
>  > open owner, and the other one is an actual new request: how should the
>  > server deal with it assuming rule (5)?
> 
> If there is a replay, then it is presumed that the first OPEN
> succeeded and there is active state which means rule #5
> doesn't come into play.  It is rule #3 and NFS4ERR_BADSEQID as
> some servers are already doing today. If the initial request
> never actually made it to the server, then there is no active
> state and client never saw a reply, so everything just
> works as if it were the first request.

Why? What is stopping something like the following from happening:

<connection 1>
OPEN
||||
<network partition or server thread hangs>
			<client initiates new connection>
					OPEN
					CLOSE

					OPEN
<network partition/hang clears>
||||
first OPEN is processed on server

You lose all ordering guarantees when the client is expected to
reconnect every time it replays a request.

>  > AFAICS, this would appear to be basically the same problem that TCP has
>  > on connections to the same port. Once you close the connection, then you
>  > need a suitable moratorium period in order to allow replays to die out
>  > before you allow a new connection. The length of that moratorium,
>  > however, is more of a "best practices" issue rather than a protocol
>  > issue since it depends on the nature of the transport used (10GigE,
>  > 10Mbit, multipathing, really old and stupid routers, etc.).
> 
> But in TCP we have only the sequence number to defend against replays,
> in NFS we have the RPC level XID and the open_owner seqid space
> to defend against wayward packets (also it would have to be on a new
> TCP connection).

NFSv4 does _not_ allow you to rely on XIDs. (See the sections listed
above.) Rather, the rules for ensuring only-once semantics are given by
the paragraph on Page 73 that says:

        If a request (r) with a previous sequence number (r < L) is
        received, it is rejected with the return of error
        NFS4ERR_BAD_SEQID.  Given a properly-functioning client, the
        response to (r) must have been received before the last request
        (L) was sent. If a duplicate of last request (r == L) is
        received, the stored response is returned. If a request beyond
        the next sequence (r == L + 2) is received, it is rejected with
        the return of error NFS4ERR_BAD_SEQID.

I can see no mention anywhere in the RFC of the word "XID", nor do I see
anything that would allow the server to assume it can override the above
rule should the client resend an RPC request using a different XID or on
a different TCP connection.
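
Read literally, the quoted paragraph amounts to a check along these
lines. This is a sketch only: the function name and result strings are
made up for illustration, every seqid beyond the next one is treated
the same way as the "L + 2" case, and wraparound of the 32-bit seqid is
ignored. Note that neither the XID nor the connection appears anywhere
in the decision.

```python
def classify_seqid(r: int, last: int) -> str:
    """Classify request seqid r against the last accepted seqid (L)."""
    if r == last:
        # Duplicate of the last request: return the stored response.
        return "replay: return stored response"
    if r == last + 1:
        # The expected next request in the sequence.
        return "next: process request"
    # Older than L, or beyond the next sequence number.
    return "NFS4ERR_BAD_SEQID"
```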

Cheers,
 Trond




From nfsv4-bounces@ietf.org Sat May 21 08:08:31 2005
Received: from localhost.localdomain ([127.0.0.1] helo=megatron.ietf.org)
	by megatron.ietf.org with esmtp (Exim 4.32)
	id 1DZSmM-0001jm-VQ; Sat, 21 May 2005 08:08:30 -0400
Received: from odin.ietf.org ([132.151.1.176] helo=ietf.org)
	by megatron.ietf.org with esmtp (Exim 4.32) id 1DZSmL-0001hw-KV
	for nfsv4@megatron.ietf.org; Sat, 21 May 2005 08:08:29 -0400
Received: from ietf-mx.ietf.org (ietf-mx.ietf.org [132.151.6.1])
	by ietf.org (8.9.1a/8.9.1a) with ESMTP id HAA06314
	for <nfsv4@ietf.org>; Sat, 21 May 2005 07:41:44 -0400 (EDT)
Received: from brmea-mail-4.sun.com ([192.18.98.36])
	by ietf-mx.ietf.org with esmtp (Exim 4.33) id 1DZIQ4-0007p2-Rt
	for nfsv4@ietf.org; Fri, 20 May 2005 21:04:49 -0400
Received: from phys-aus08-1 ([129.153.131.88])
	by brmea-mail-4.sun.com (8.12.10/8.12.9) with ESMTP id j4L0kh0R019169
	for <nfsv4@ietf.org>; Fri, 20 May 2005 18:46:44 -0600 (MDT)
Received: from conversion-daemon.aus08-mail1.central.sun.com by
	aus08-mail1.central.sun.com
	(iPlanet Messaging Server 5.2 HotFix 1.24 (built Dec 19 2003))
	id <0IGT00201D6IYF@aus08-mail1.central.sun.com>
	(original mail from David.Robinson@Sun.COM) for nfsv4@ietf.org; Fri,
	20 May 2005 19:46:43 -0500 (CDT)
Received: from [129.153.128.60] (jetsun.Central.Sun.COM [129.153.128.60])
	by aus08-mail1.central.sun.com
	(iPlanet Messaging Server 5.2 HotFix 1.24 (built Dec 19 2003))
	with ESMTP id <0IGT0045CE5T2L@aus08-mail1.central.sun.com> for
	nfsv4@ietf.org; Fri, 20 May 2005 19:46:43 -0500 (CDT)
Date: Fri, 20 May 2005 19:46:41 -0500
From: David Robinson <David.Robinson@Sun.COM>
Subject: Re: [nfsv4] more re: client re-using lock_owner
In-reply-to: <C98692FD98048C41885E0B0FACD9DFB8BBBE5E@exnane01.hq.netapp.com>
To: nfsv4@ietf.org
Message-id: <428E84F1.10207@sun.com>
MIME-version: 1.0
Content-type: text/plain; charset=ISO-8859-1; format=flowed
Content-transfer-encoding: 7BIT
X-Accept-Language: en-us, en
User-Agent: Mozilla Thunderbird 1.0 (X11/20050129)
References: <C98692FD98048C41885E0B0FACD9DFB8BBBE5E@exnane01.hq.netapp.com>
X-Spam-Score: 0.0 (/)
X-Scan-Signature: a92270ba83d7ead10c5001bb42ec3221
Content-Transfer-Encoding: 7BIT
X-BeenThere: nfsv4@ietf.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: NFSv4 Working Group <nfsv4.ietf.org>
List-Unsubscribe: <https://www1.ietf.org/mailman/listinfo/nfsv4>,
	<mailto:nfsv4-request@ietf.org?subject=unsubscribe>
List-Archive: <http://www1.ietf.org/pipermail/nfsv4>
List-Post: <mailto:nfsv4@ietf.org>
List-Help: <mailto:nfsv4-request@ietf.org?subject=help>
List-Subscribe: <https://www1.ietf.org/mailman/listinfo/nfsv4>,
	<mailto:nfsv4-request@ietf.org?subject=subscribe>
Sender: nfsv4-bounces@ietf.org
Errors-To: nfsv4-bounces@ietf.org

Noveck, Dave wrote:

> The spec gives no constraints on this.  A server may drop the
> owner state immediately after the last close of an open for this
> owner if it wants.  The only consequence of doing that would be
> a lot of extra OPEN_CONFIRM's and thus bad performance.  It would
> be legal however.  
> 
> In this case, we are talking about, where the hypothesis is that 
> the server does indeed have information about the openowner, it 
> is clear this is not the "first time" the server is seeing that
> owner and the server knows that.  The only reason that this is 
> not illegal is that judgments of spec legality are subject to 
> an epistemological constraint.  They may not refer to details
> of the internals of the server, even though we may very well
> know about them.  If something is illegal it must be determinable
> from behavior visible over the wire.  But note that I am not
> saying this is illegal.  I'm just saying it seems bad to me
> and I think I may take into account what we know about the 
> actual situation when we discuss those questions.

As the specification is written, it is perfectly legal to return
either NFS4ERR_BADSEQID or OPEN4_RESULT_CONFIRM. The client must
be prepared to handle either. I am not advocating that either
method is illegal, just that OPEN4_RESULT_CONFIRM would be
preferred.

>>And how is a client expected
>>to be able to reliably determine that fact?

The spec's use of 'first_time' may lead the reader to
expect that the client somehow tracks the server's notion
of 'first time'. That seems neither reasonable nor reliable.
The whole issue of 'first_time' is simply an optimization
by the server to minimize OPEN_CONFIRM traffic. The minimal
required functionality is to return OPEN4_RESULT_CONFIRM if
there is no active state.

My argument is that a server that cannot perform the optimization
because the seqid used is not the correct next seqid should act
as if it were starting from scratch and had not kept track of
the last seqid, and reply with OPEN4_RESULT_CONFIRM.

The difference is that in one case the server replaces the notion
of the next valid seqid and in the other the client must create
a new unique open_owner. This would seem to be easier for the
server as it is not leaving behind 'old' open_owners to track
and better for the client as it is one less round trip by
saving the OPEN with new open_owner request.

[ignoring my bad and withdrawn example]


>>So I would like Door #2 Monty...

> It is legal.  You may go through the door.  If I say overly harsh
> things about your choice, don't take it personally.  I just don't
> think it is the right choice, even though I agree it is legal.

We agree that both are legal. I just find that if the server is
nicer the client has less work to do. Asking for an OPEN_CONFIRM
feels friendlier than issuing an NFS4ERR_BADSEQID.

>>Reasonable behavior should be:

> We need to separate what someone thinks is reasonable from what
> is in RFC3530.  If you think the spec should have allowed clients
> to forget state at will, then that is one thing.  But I don't think 
> the spec actually did that.

I believe that the spec is agnostic to what the client and server
do with respect to maintaining open_owner seqid numbers when
there is no active state. The server can maintain nothing,
always requiring an OPEN_CONFIRM, and the client can use
any value but must be willing to accept an NFS4ERR_BADSEQID
error.

>>1) Client MUST use the correct next seqid if there is active state
>>   for that lock_owner.
>>2) Client SHOULD use the correct next seqid if there is no
>>   active state for that lock_owner.

> Strictly speaking, the client may send any seqid he wants but if it
> is not the correct one, he has to be prepared for BADSEQID.  If there
> is open state and he uses an incorrect seqid, then he may rely on 
> getting BADSEQID.  If there is no open, he may not rely on getting that
> error, since the server may have forgotten the owner state.

Agreed.

>>4) The server SHOULD allow the OPEN if the lock_owner has the
>>   correct seqid and there is no active state for that lock_owner.

> He must process the open normally.  Whether it is allowed depends on
> lots of stuff.

Agreed, "allowed" is dependent on all the other stuff involved
in OPEN.  This case is the strong SHOULD that a server ought to
implement the optimization eliminating OPEN_CONFIRMs.

>>5) The server SHOULD (MUST?) return OPEN4_RESULT_CONFIRM if the
>>   lock_owner has an incorrect seqid and there is no active state
>>   for that lock_owner.

> Why?  If you assert that this is somehow required by RFC3530, then
> where exactly does it say that?  I think it says that if it knows
> a seqid is invalid it should return BADSEQID.  I don't think the fact
> that it might sometimes not know that changes that.  It may be
> that 5) could have been added to the spec without harm, but it
> wasn't.
> 
> As far as the (MUST?) alternative, how can you say you "SHOULD"
> do the open if he has the correct seqid and "MUST" do it if he
> has an incorrect seqid?

The (MUST?) is more of an editorial comment that the RFC doesn't require
this, but if I had thought of this scenario a few years ago I would
have pushed to make this a MUST. But it is not required anywhere,
just as the server is not required to maintain the last valid seqid
when there is no active state just so it can return BADSEQID.

>>So in normal processing everyone SHOULD make a reasonable effort
>>to maintain enough information to minimize the need to use
>>OPEN_CONFIRM. But in the absence of active state, if one side
>>forgets, it is useful to allow the lock_owner to be reused
>>by sending OPEN4_RESULT_CONFIRM instead of NFS4ERR_BADSEQID.

> I don't see how it is useful.  The client should not reuse the
> same opaque string for different owners and if he forgets the
> seqid, then these are different owners.

A simple client may choose to use a simple value as the open_owner
and want to maintain very minimal state and do something trivial
like restart the seqid at zero if there are no open files.

In the BADSEQID case the client will need to do the following
operations:
	OPEN oo1 seq0 ->
		<- BADSEQID
	OPEN oo2 seq0 ->
		<- OPEN4_RESULT_CONFIRM
	OPEN_CONFIRM ->
		<- OK
This could be even worse if the choice of 'oo2' also conflicted
with some previous open_owner that the server is still tracking
the seqid for, triggering multiple tries by the client to get
a valid open_owner.

If the server is nicer, it can be reduced to:
	OPEN oo1 seq0 ->
		<- OPEN4_RESULT_CONFIRM
	OPEN_CONFIRM ->
		<- OK

The client doesn't need to be as aggressive at creating new
open_owners and the server doesn't have a growing list
of open_owner/seqid pairs to track to optimize a case
that is very unlikely to ever occur. Even a server that
returns BADSEQID ought to just choose to forget that
open_owner as it has effectively just told that client to
never use it again.  So it might as well just allow reuse
by issuing an OPEN4_RESULT_CONFIRM.
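
The two exchanges above can be compared with a small Python sketch.
This is only an illustration under stated assumptions: the Server
class, its "friendly" flag, and requests_needed() are hypothetical
names, and the model ignores everything about OPEN except the
open_owner seqid check for an owner with no active state.

```python
from dataclasses import dataclass, field

BADSEQID = "NFS4ERR_BADSEQID"
CONFIRM = "OPEN4_RESULT_CONFIRM"
OK = "NFS4_OK"


@dataclass
class Server:
    """Toy server; 'remembered' maps an owner with no open state to the
    next seqid the server expects (hypothetical model)."""
    friendly: bool
    remembered: dict = field(default_factory=dict)

    def open(self, owner, seqid):
        expected = self.remembered.get(owner)
        if expected is not None and seqid == expected:
            self.remembered[owner] = seqid + 1
            return OK                      # optimization: no OPEN_CONFIRM
        if expected is None or self.friendly:
            self.remembered[owner] = seqid + 1
            return CONFIRM                 # treat the owner as new
        return BADSEQID                    # strict: reject the reused owner


def requests_needed(server, owners):
    """Count requests a simple client sends, always starting at seqid 0
    and moving to the next owner string after BADSEQID."""
    sent = 0
    for owner in owners:
        sent += 1                          # the OPEN itself
        result = server.open(owner, 0)
        if result == BADSEQID:
            continue
        if result == CONFIRM:
            sent += 1                      # the OPEN_CONFIRM
        return sent
    raise RuntimeError("every owner string was rejected")


strict = Server(friendly=False, remembered={"oo1": 7})
nice = Server(friendly=True, remembered={"oo1": 7})
print(requests_needed(strict, ["oo1", "oo2"]))  # 3 requests
print(requests_needed(nice, ["oo1", "oo2"]))    # 2 requests
```

The remembered value 7 is arbitrary; any value other than 0 gives
the same counts.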

> You may think so but the fact is that the spec discusses the
> case of the server forgetting the stateid and never mentions
> the case of the client forgetting it.  If you think that that is
> an oversight, then you need an argument to show why this is a
> necessary case to deal with.  I assert that it is not, since the
> cases of the server and the client are not parallel.  The client
> decides on the lifetime of owners and the server has no way of
> knowing when the client is done with one.  This requires special
> handling to deal with the case of a server who is forced to drop
> state.  The client doesn't have that problem.  He may drop owners
> at will with no special protocol provision, as long as he doesn't
> try to forget the seqid and then use that same owner string again.

I don't think there is an oversight.  The RFC chose to describe how
a server can be optimal and remember old seqid's when there is no
active state. That does not mean that it is incorrect to be silent
about how a client may or may not reuse an open_owner. Functionally
as written the RFC is complete, I am just proposing that the server
can be friendlier in the case where it chooses to remember old
seqids.

All of this is really picking nits on a very rare edge case. The server
is always correct to return BADSEQID when there is no active
state for an open_owner. But it can be more optimal or friendly.

	-David

_______________________________________________
nfsv4 mailing list
nfsv4@ietf.org
https://www1.ietf.org/mailman/listinfo/nfsv4


From nfsv4-bounces@ietf.org Sat May 21 08:13:48 2005
Received: from localhost.localdomain ([127.0.0.1] helo=megatron.ietf.org)
	by megatron.ietf.org with esmtp (Exim 4.32)
	id 1DZSqq-0004G8-5m; Sat, 21 May 2005 08:13:08 -0400
Received: from odin.ietf.org ([132.151.1.176] helo=ietf.org)
	by megatron.ietf.org with esmtp (Exim 4.32) id 1DZSlv-0001XN-KK
	for nfsv4@megatron.ietf.org; Sat, 21 May 2005 08:08:03 -0400
Received: from ietf-mx.ietf.org (ietf-mx.ietf.org [132.151.6.1])
	by ietf.org (8.9.1a/8.9.1a) with ESMTP id HAA05860
	for <nfsv4@ietf.org>; Sat, 21 May 2005 07:41:05 -0400 (EDT)
Received: from web30303.mail.mud.yahoo.com ([68.142.200.96])
	by ietf-mx.ietf.org with smtp (Exim 4.33) id 1DZJAB-0004o8-Cw
	for nfsv4@ietf.org; Fri, 20 May 2005 21:52:27 -0400
Received: (qmail 25195 invoked by uid 60001); 21 May 2005 01:34:14 -0000
Comment: DomainKeys? See http://antispam.yahoo.com/domainkeys
DomainKey-Signature: a=rsa-sha1; q=dns; c=nofws; s=s1024; d=yahoo.com;
	b=z9NFcNGaO29BamJROeWHsP7UR3bfID9101EaJxSrIh2BobJRhQfvPsL+w++EX6Kz4jCREZi8QsMcK4cF83M4AtT7ttqvM/F05SPxQZUenkHtASRwgvv/f/cSG6wvC8xbckmOl4xQXJJT976YqpMrxwkbV7YyuBhpADvOjJukrus=
	; 
Message-ID: <20050521013414.25193.qmail@web30303.mail.mud.yahoo.com>
Received: from [198.95.226.224] by web30303.mail.mud.yahoo.com via HTTP;
	Fri, 20 May 2005 18:34:14 PDT
Date: Fri, 20 May 2005 18:34:14 -0700 (PDT)
From: Mike Eisler <email2mre-ietf@yahoo.com>
Subject: RE: [nfsv4] more re: client re-using lock_owner
To: nfsv4@ietf.org
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
X-Spam-Score: 0.5 (/)
X-Scan-Signature: 5ebbf074524e58e662bc8209a6235027
X-BeenThere: nfsv4@ietf.org
X-Mailman-Version: 2.1.5
Precedence: list
Reply-To: email2mre-ietf@yahoo.com
List-Id: NFSv4 Working Group <nfsv4.ietf.org>
List-Unsubscribe: <https://www1.ietf.org/mailman/listinfo/nfsv4>,
	<mailto:nfsv4-request@ietf.org?subject=unsubscribe>
List-Archive: <http://www1.ietf.org/pipermail/nfsv4>
List-Post: <mailto:nfsv4@ietf.org>
List-Help: <mailto:nfsv4-request@ietf.org?subject=help>
List-Subscribe: <https://www1.ietf.org/mailman/listinfo/nfsv4>,
	<mailto:nfsv4-request@ietf.org?subject=subscribe>
Sender: nfsv4-bounces@ietf.org
Errors-To: nfsv4-bounces@ietf.org

> I assume this is Mike's obfuscated reply address.
                   ^^^^
                   Eisler

This new address will self destruct once the spammers
find it. (My established correspondents should be able
to use my previous email address, unless they are 
asking me to buy mortgages from Nigerian banks backed by
shareholders who made it big in Viagra sales. Oops,
I've just tripped all your spamassassin filters and you
won't see this).

> email2mre-ietf@yahoo.com wrote:
>  > Here's what bothers me about #2. Let's say reason for
>  > the re-used open_owner is not because the client has
>  > forgotten, but because of a retry of an OPEN,
>  > but the client still remembers the open_owner state.
> 
> If the previous OPEN succeeded the server will have state

Not sure you understand what Trond and I are saying.

Let's say the sequence of events is:

time t: OPEN file 1 (seq 1) -->

   times out/held up somewhere

time t+1: OPEN file 1 [retry] (seq 1) -->

time t+2                     <-- OPEN resp from time t

time t+3: client requences response sent a t+1.

time t+4: OPEN_CONFIRM (seq 2) -->

time t+5:                    <-- OPEN_CONFIRM resp

time t+6: OPEN file2(seq 3) -->

...

time t+n ... time t+n+m (all files for the open_owner closed, all locks released)

sequence number for open_owner is now 1000

time t+n+m+1: OPEN retry from t+1 finally reaches server.

          at same time:

time t+n+m+2: OPEN file100 seq (1000) -->

The client, at t+n+m+2 thinks his seq is 1000. If the
server implements door #2, then the open for file100 gets BADSEQ,
because the OPEN at t+n+m+1 is accepted without a BADSEQ.
I believe this will confuse most clients, and perhaps all of
the clients that show up at bakeathons.
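
The tail end of that timeline can be replayed in a short Python
sketch. It is a hedged model, not any server's real logic:
handle_open(), the state dictionary, and the door2 flag are
hypothetical, and an unconfirmed open is assumed to count as
active state until the next operation.

```python
def handle_open(state, seqid, door2):
    """Apply one OPEN to per-owner state.

    state["next"]   -- seqid the server expects next
    state["active"] -- True once the owner has (possibly unconfirmed) open state
    door2           -- True: accept a wrong seqid when there is no active state
    """
    if seqid == state["next"]:
        state["next"] += 1
        state["active"] = True
        return "NFS4_OK"
    if door2 and not state["active"]:
        # Door #2: restart the sequence and ask for confirmation.
        state["next"] = seqid + 1
        state["active"] = True
        return "OPEN4_RESULT_CONFIRM"
    return "NFS4ERR_BADSEQID"


def replay_scenario(door2):
    """Delayed OPEN retry (seq 1) arrives just before the real OPEN (seq 1000)."""
    state = {"next": 1000, "active": False}   # all files closed, locks released
    stale = handle_open(state, 1, door2)      # t+n+m+1: retry from t+1
    fresh = handle_open(state, 1000, door2)   # t+n+m+2: OPEN file100
    return stale, fresh


print(replay_scenario(door2=True))
print(replay_scenario(door2=False))
```

With door2=True the stale retry is accepted and the client's
legitimate OPEN at seq 1000 is the one rejected; with door2=False
the stale retry is rejected and the sequence state is left alone.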

>  > If we return NFS4ERR_BADSEQID,
>  > then this doesn't perturb the existing sequence number state
>  > for the open_owner. If we request open confirmation, then
>  > the server has two choices:
> 
>  > 1. Perturb the state. So if the client in fact had not forgotten
>  >    about the open_owner's previous sequence number, the next time
>  >    client goes to use a sequence number from the previous use of
>  >    the open_owner, he gets NFS4ERR_BADSEQID. And gets very
>  >    confused.
> 
> This would only happen if we have a buggy client. If the server
> requests an open_confirm he is telling the client that they

I should have said if the server "requests open_confirm and resets the sequence
number to 2"

> are establishing new state. If a client holds onto
> its old state  after getting a OPEN4_RESULT_CONFIRM it is
> just broken.

This was a retried OPEN that the client has long since
forgotten about, because he has long since achieved success
with that operation.

The client doesn't care if the server is requesting OPEN_CONFIRM;
he's going to drop the OPEN response. He's going to drop it because
the xid of that response does not correspond to any outstanding
request.

The problem is that short of an infinite duplicate request
cache, the server cannot distinguish (1) retried OPENs from (2)
a reset of the sequence numbers by a new OPEN from a client
that has forgotten about his unused open_owner. So he has to
maintain the multiple quantum-mechanics-like states I mentioned,
until the next operation.
If the next operation is OPEN_CONFIRM with seq #2 and the matching stateid X,
then the other state associated with sequence #1000 is disposed of. If the
next operation is OPEN with sequence #1000, then the state with sequence #2
is disposed of. So implementing door #2 requires Schrodinger states.

(The point of all this sequence number stuff we (you actually :-) added
to NFSv4 was to dispense with the dup request cache for nasty
non-idempotent operations like open and locks.)

What is missing here is a RELEASE_OPENOWNER operation. With that,
the issue of clients forgetting about unused OPEN_OWNERS would be moot.

The other thing is that OPEN_CONFIRM is a kludge
(courtesy me), but it was put in to save a separate round
trip to establish an open_owner. Given the separation
between open_owners and lock_owners a separate operation
for creating open_owners might not be a bad thing. But anyway,
draft-ietf-nfsv4-sess-01.txt kills OPEN_CONFIRM.


_______________________________________________
nfsv4 mailing list
nfsv4@ietf.org
https://www1.ietf.org/mailman/listinfo/nfsv4


From nfsv4-bounces@ietf.org Sat May 21 10:19:41 2005
Received: from localhost.localdomain ([127.0.0.1] helo=megatron.ietf.org)
	by megatron.ietf.org with esmtp (Exim 4.32)
	id 1DZUpJ-0004OZ-Dn; Sat, 21 May 2005 10:19:41 -0400
Received: from odin.ietf.org ([132.151.1.176] helo=ietf.org)
	by megatron.ietf.org with esmtp (Exim 4.32) id 1DZUpH-0004OP-Vi
	for nfsv4@megatron.ietf.org; Sat, 21 May 2005 10:19:40 -0400
Received: from ietf-mx.ietf.org (ietf-mx.ietf.org [132.151.6.1])
	by ietf.org (8.9.1a/8.9.1a) with ESMTP id KAA24291
	for <nfsv4@ietf.org>; Sat, 21 May 2005 10:19:37 -0400 (EDT)
Received: from mx1.netapp.com ([216.240.18.38])
	by ietf-mx.ietf.org with esmtp (Exim 4.33) id 1DZV6r-0006aT-RN
	for nfsv4@ietf.org; Sat, 21 May 2005 10:37:50 -0400
Received: from smtp1.corp.netapp.com (10.57.156.124)
	by mx1.netapp.com with ESMTP; 21 May 2005 07:19:30 -0700
X-IronPort-AV: i="3.93,125,1115017200"; 
	d="scan'208"; a="172396768:sNHT23807704"
Received: from svlexc03.hq.netapp.com (svlexc03.corp.netapp.com
	[10.57.156.149])
	by smtp1.corp.netapp.com (8.13.1/8.13.1/NTAP-1.6) with ESMTP id
	j4LEJU76027099; Sat, 21 May 2005 07:19:30 -0700 (PDT)
Received: from lavender.hq.netapp.com ([10.56.11.75]) by
	svlexc03.hq.netapp.com with Microsoft SMTPSVC(6.0.3790.0); 
	Sat, 21 May 2005 07:19:29 -0700
Received: from exnane01.hq.netapp.com ([10.97.0.61]) by lavender.hq.netapp.com
	with Microsoft SMTPSVC(5.0.2195.6713); 
	Sat, 21 May 2005 07:19:29 -0700
X-MimeOLE: Produced By Microsoft Exchange V6.5.7226.0
Content-class: urn:content-classes:message
MIME-Version: 1.0
Content-Type: text/plain;
	charset="iso-8859-1"
Content-Transfer-Encoding: quoted-printable
Subject: RE: [nfsv4] more re: client re-using lock_owner
Date: Sat, 21 May 2005 10:19:28 -0400
Message-ID: <C98692FD98048C41885E0B0FACD9DFB8BBBE60@exnane01.hq.netapp.com>
Thread-Topic: [nfsv4] more re: client re-using lock_owner
Thread-Index: AcVd/szzZ6FeUokzTbSA63eKIoTVLAABS/hg
From: "Noveck, Dave" <Dave.Noveck@netapp.com>
To: <email2mre-ietf@yahoo.com>, <nfsv4@ietf.org>
X-OriginalArrivalTime: 21 May 2005 14:19:29.0862 (UTC)
	FILETIME=[19FFF660:01C55E10]
X-Spam-Score: 0.0 (/)
X-Scan-Signature: 86f85b2f88b0d50615aed44a7f9e33c7
Content-Transfer-Encoding: quoted-printable
Cc: 
X-BeenThere: nfsv4@ietf.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: NFSv4 Working Group <nfsv4.ietf.org>
List-Unsubscribe: <https://www1.ietf.org/mailman/listinfo/nfsv4>,
	<mailto:nfsv4-request@ietf.org?subject=unsubscribe>
List-Archive: <http://www1.ietf.org/pipermail/nfsv4>
List-Post: <mailto:nfsv4@ietf.org>
List-Help: <mailto:nfsv4-request@ietf.org?subject=help>
List-Subscribe: <https://www1.ietf.org/mailman/listinfo/nfsv4>,
	<mailto:nfsv4-request@ietf.org?subject=subscribe>
Sender: nfsv4-bounces@ietf.org
Errors-To: nfsv4-bounces@ietf.org

Even supposing that a server could have complicated
Schrodinger logic that works, I don't think this idea
(client resetting seqids when there is no open state)
would work.

Suppose a client, with a given lockowner opens a file,
does a little stuff and closes with the owner at the
close being at seqid 3.

Now suppose that later the client sends seqid 0 (same
owner string, sent as part of a "new" openowner), and
the server accepts it and asks for an open confirm, which
is duly sent with seqid 1.

Now you receive seqid 2 and the question arises whether it
is part of the "new" openowner sequence or is something that has
just been disgorged by a router.  The server can't tell so
if he gets seqid 2 he will execute it, correctly or not.

The issue here is that the spec requires ascending seqid's
for a given owner for a reason.  If they can be reset to zero
then you can have different messages with the same seqid/
owner-string/clientid triple and that is *bad*.
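
The collision can be made concrete with a few lines of Python. This
is a sketch only: the clientid and owner string are made up, and it
assumes the first use of the owner sent seqids 0 through 3 while the
"new" use has so far sent 0 through 2.

```python
def triples(clientid, owner, seqids):
    """Build the (clientid, owner-string, seqid) triple for each request."""
    return [(clientid, owner, seqid) for seqid in seqids]


# First use of the owner: ends with the close at seqid 3.
first_use = triples("client-1", "owner-A", range(0, 4))

# Same owner string after the client resets its seqid to zero.
second_use = triples("client-1", "owner-A", range(0, 3))

# Triples that now identify two different requests.
ambiguous = sorted(set(first_use) & set(second_use))
print(ambiguous)
```

Every triple in the second use also appears in the first, so a late
packet from the first use cannot be told apart from a request in the
second.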

If a client tries to reset the seqid, the result should be
BADSEQID.  Under some circumstances, the server may be
unable to detect this situation (dropping apparently unused
state), and so the spec cannot require this in all situations,
but it is something the server should try to do and it
certainly should do that when it has the information available,
as in the case we are talking about.

If the client resets the seqid he has to expect BADSEQID.  If
the server is unable to make the check, then OPEN_CONFIRM is
used to determine whether the potentially bad seqid received is
OK or not.  If the client's response is to send OPEN_CONFIRM,
then he is saying that the seqid for the open is the correct one,
i.e. it is in the correct sequence.  If he does that when it
isn't, then he is deliberately violating the protocol and the
server will not be able to reliably detect replays.


Mike Eisler wrote:
> The other thing is that OPEN_CONFIRM is a kludge
> (courtesy me), but it was put in to save a separate round
> trip to establish an open_owner. Given the separation
> between open_owners and lock_owners a separate operation
> for creating open_owners might not be a bad thing. But anyway,
> draft-ietf-nfsv4-sess-01.txt kills OPEN_CONFIRM.

It's a brief for the prosecution.  The v4.1 spec will be the
death sentence, but since sessions are optional, we will have
to wait at least until v4.2 to actually kill it.  Call me
bloodthirsty but I'd like to get on with it.


_______________________________________________
nfsv4 mailing list
nfsv4@ietf.org
https://www1.ietf.org/mailman/listinfo/nfsv4


From nfsv4-bounces@ietf.org Sun May 22 19:11:44 2005
Received: from localhost.localdomain ([127.0.0.1] helo=megatron.ietf.org)
	by megatron.ietf.org with esmtp (Exim 4.32)
	id 1DZzbk-0008Ho-QW; Sun, 22 May 2005 19:11:44 -0400
Received: from odin.ietf.org ([132.151.1.176] helo=ietf.org)
	by megatron.ietf.org with esmtp (Exim 4.32) id 1DZzbi-0008Hc-U3
	for nfsv4@megatron.ietf.org; Sun, 22 May 2005 19:11:43 -0400
Received: from ietf-mx.ietf.org (ietf-mx.ietf.org [132.151.6.1])
	by ietf.org (8.9.1a/8.9.1a) with ESMTP id TAA05013
	for <nfsv4@ietf.org>; Sun, 22 May 2005 19:11:40 -0400 (EDT)
Received: from nwkea-mail-1.sun.com ([192.18.42.13])
	by ietf-mx.ietf.org with esmtp (Exim 4.33) id 1DZzta-0008F9-AU
	for nfsv4@ietf.org; Sun, 22 May 2005 19:30:10 -0400
Received: from phys-aus08-1 ([129.153.131.88])
	by nwkea-mail-1.sun.com (8.12.10/8.12.9) with ESMTP id j4MNBeME000749
	for <nfsv4@ietf.org>; Sun, 22 May 2005 16:11:40 -0700 (PDT)
Received: from conversion-daemon.aus08-mail1.central.sun.com by
	aus08-mail1.central.sun.com
	(iPlanet Messaging Server 5.2 HotFix 1.24 (built Dec 19 2003))
	id <0IGW00A01XPY6A@aus08-mail1.central.sun.com>
	(original mail from David.Robinson@Sun.COM) for nfsv4@ietf.org; Sun,
	22 May 2005 18:11:40 -0500 (CDT)
Received: from [192.168.0.2]
	(vpn-129-153-214-73.Central.Sun.COM [129.153.214.73])
	by aus08-mail1.central.sun.com
	(iPlanet Messaging Server 5.2 HotFix 1.24 (built Dec 19 2003))
	with ESMTP id <0IGW002Z6Z3GQX@aus08-mail1.central.sun.com>; Sun,
	22 May 2005 18:11:40 -0500 (CDT)
Date: Sun, 22 May 2005 18:11:39 -0500
From: David Robinson <David.Robinson@Sun.COM>
Subject: Re: [nfsv4] more re: client re-using lock_owner
In-reply-to: <1116645740.15684.84.camel@lade.trondhjem.org>
To: nfsv4@ietf.org
Message-id: <bcd09c352f6b3f6d614da56e4d7a71ed@sun.com>
MIME-version: 1.0 (Apple Message framework v622)
X-Mailer: Apple Mail (2.622)
Content-type: text/plain; charset=US-ASCII; format=flowed
Content-transfer-encoding: 7BIT
References: <C98692FD98048C41885E0B0FACD9DFB8BBBE55@exnane01.hq.netapp.com>
	<428E23A0.9090300@sun.com> <20050520180139.GA2423@fieldses.org>
	<428E782E.40304@sun.com> <1116645740.15684.84.camel@lade.trondhjem.org>
X-Spam-Score: 0.0 (/)
X-Scan-Signature: 68ba2b07ef271dba6ee42a93832cfa4c
Content-Transfer-Encoding: 7BIT
Cc: rick@snowhite.cis.uoguelph.ca
X-BeenThere: nfsv4@ietf.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: NFSv4 Working Group <nfsv4.ietf.org>
List-Unsubscribe: <https://www1.ietf.org/mailman/listinfo/nfsv4>,
	<mailto:nfsv4-request@ietf.org?subject=unsubscribe>
List-Archive: <http://www1.ietf.org/pipermail/nfsv4>
List-Post: <mailto:nfsv4@ietf.org>
List-Help: <mailto:nfsv4-request@ietf.org?subject=help>
List-Subscribe: <https://www1.ietf.org/mailman/listinfo/nfsv4>,
	<mailto:nfsv4-request@ietf.org?subject=subscribe>
Sender: nfsv4-bounces@ietf.org
Errors-To: nfsv4-bounces@ietf.org

On May 20, 2005, at 10:22 PM, Trond Myklebust wrote:
>> If there is a replay, then it is presumed that the first OPEN
>> succeeded and there is active state which means rule #5
>> doesn't come into play.  It is rule #3 and NFS4ERR_BADSEQID as
>> some servers are already doing today. If the initial request
>> never actually made it to the server, then there is no active
>> state and client never saw a reply, so everything just
>> works as if it were the first request.
>
> Why? What is stopping something like the following from happening:
>
> <connection 1>
> OPEN
> ||||
> <network partition or server thread hangs>
> 			<client initiates new connection>
> 					OPEN
> 					CLOSE
>
> 					OPEN
> <network partition/hang clears>
> ||||
> first OPEN is processed on server
>
> You lose all ordering guarantees when the client is expected to
> reconnect every time it replays a request.

Having the server request an OPEN_CONFIRM vs BADSEQID doesn't
change the replay semantics.  You didn't put numbers in your example
so let's use more detail:

1) Client sends OPEN oo1 seq1; it gets delayed a long time
2) Client replays OPEN oo1 seq1 on new connection
3) Server performs OPEN now expecting seq2
4) Client sends CLOSE seq2
5) Server performs CLOSE and now has no active state

If the client's original OPEN now gets unstuck and gets
presented to the server, what happens?

The server can legally be in one of two conditions: it either
remembers that it is expecting seq3, or it never remembered
or has had its internal variables garbage collected.

If the server returns BADSEQID (which it legally can) the
client will see that but it also will know that it is
for an RPC that it has already discarded when it issued the
replay, the XID is what governs this. A client replaying
a request on a different connection with the same XID
is just a broken RPC client. XID reuse can only be done
for transport level retransmits. This is independent of
a duplicate request cache being used or not.

If the server returns a request for OPEN_CONFIRM the client
is in the same condition as above. It has already disposed
of the initial OPEN and will fail to OPEN_CONFIRM ending
the chain. This is not unlike an impatient client abandoning
a normal first OPEN before the server responds.

In your example, it also appears to have the condition
where after the CLOSE a new OPEN has succeeded on the
new connection before the original OPEN gets processed.
In this case the server MUST return BADSEQID because it
has active state and it is an old value. If I recall the
original discussions on this, it was proposed that the
server simply drop the request as the client has clearly
already moved past the original request. But it was decided
to still respond in case some client could use this information
in some way outside the required behavior (a clue that there
may be bad hardware in the network?).

>>> AFAICS, this would appear to be basically the same problem that TCP
>>> has on connections to the same port. Once you close the connection,
>>> then you need a suitable moratorium period in order to allow replays
>>> to die out before you allow a new connection. The length of that
>>> moratorium, however, is more of a "best practices" issue rather than
>>> a protocol issue since it depends on the nature of the transport used
>>> (10GigE, 10Mbit, multipathing, really old and stupid routers, etc.).
>>
>> But in TCP we have only the sequence number to defend against replays,
>> in NFS we have the RPC level XID and the open_owner seqid space
>> to defend against wayward packets (also it would have to be on a new
>> TCP connection).
>
> NFSv4 does _not_ allow you to rely on XIDs. (See the sections listed
> above.) Rather, the rules for ensuring only-once semantics are given by
> the paragraph on Page 73 that says:
>
>         If a request (r) with a previous sequence number (r < L) is
>         received, it is rejected with the return of error
>         NFS4ERR_BAD_SEQID.  Given a properly-functioning client, the
>         response to (r) must have been received before the last request
>         (L) was sent. If a duplicate of last request (r == L) is
>         received, the stored response is returned. If a request beyond
>         the next sequence (r == L + 2) is received, it is rejected with
>         the return of error NFS4ERR_BAD_SEQID.
>
> I can see no mention anywhere in the RFC of the word "XID", nor do I
> see anything that would allow the server to assume it can override the
> above rule should the client resend an RPC request using a different
> XID or on a different TCP connection.

The XID is a bit of a red herring, but it is still true that NFS
has more information and state than TCP, so it is not a correct
analogy to compare the operation with that of TCP port reuse
moratoriums. TCP must maintain the old sequence number
because if it doesn't it has no way to detect if the packet
is from an old connection. With NFS level OPENs there are two
states that the server may be in. It may either have active
state on the openowner or be maintaining the last seqid
as an optimization; in this case it can defend against replays.
The other case is that it has never seen the openowner or has
garbage collected its state; it then requests an OPEN_CONFIRM,
effectively asking the client 'are you sure'? The lack
of confirming is what requires TCP to have the moratorium.

There is no weakening of replay prevention by having
the server request an OPEN_CONFIRM when there is no
active state instead of just issuing a BADSEQID. In fact
it is exactly the same as a replay that arrives after
a server garbage collects any cached open owner state.
The advantage is that it lessens the requirement that
the client not reuse open_owners and takes out one
round trip if it does reuse.
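
As a rough illustration only, the seqid rule quoted from the RFC and the
two server states described above can be sketched as follows. The
function names and the string return values are hypothetical, and the
wraparound at 2**32 is an assumption of this sketch, not something taken
from the messages in this thread.

```python
from typing import Optional

SEQID_MODULUS = 2**32  # assumption: seqid is treated as a 32-bit counter


def classify_seqid(last: int, r: int) -> str:
    """Apply the quoted rule to request seqid r, given last processed seqid L."""
    if r == last:
        return "REPLAY_STORED_RESPONSE"  # duplicate of the last request
    if r == (last + 1) % SEQID_MODULUS:
        return "PROCESS"                 # the next request in sequence
    return "NFS4ERR_BAD_SEQID"           # older than L, or beyond L + 1


def handle_open(last: Optional[int], r: int) -> str:
    """Hypothetical dispatch for an OPEN from a single open_owner.

    last is None when the server has never seen the owner or has garbage
    collected its state; otherwise it is the last seqid processed.
    """
    if last is None:
        return "OPEN4_RESULT_CONFIRM"    # ask the client 'are you sure?'
    return classify_seqid(last, r)
```

The proposal under discussion would change only the branch where the
server still remembers a seqid but has no active state: it would answer
with OPEN4_RESULT_CONFIRM there instead of NFS4ERR_BAD_SEQID.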

	-David


_______________________________________________
nfsv4 mailing list
nfsv4@ietf.org
https://www1.ietf.org/mailman/listinfo/nfsv4


From nfsv4-bounces@ietf.org Mon May 23 13:21:24 2005
Received: from localhost.localdomain ([127.0.0.1] helo=megatron.ietf.org)
	by megatron.ietf.org with esmtp (Exim 4.32)
	id 1DaGcF-00071W-Vj; Mon, 23 May 2005 13:21:23 -0400
Received: from odin.ietf.org ([132.151.1.176] helo=ietf.org)
	by megatron.ietf.org with esmtp (Exim 4.32) id 1DaGcE-00071R-JV
	for nfsv4@megatron.ietf.org; Mon, 23 May 2005 13:21:22 -0400
Received: from ietf-mx.ietf.org (ietf-mx.ietf.org [132.151.6.1])
	by ietf.org (8.9.1a/8.9.1a) with ESMTP id NAA15929
	for <nfsv4@ietf.org>; Mon, 23 May 2005 13:21:19 -0400 (EDT)
Received: from pat.uio.no ([129.240.130.16] ident=7411)
	by ietf-mx.ietf.org with esmtp (Exim 4.33) id 1DaGuF-0004ac-M9
	for nfsv4@ietf.org; Mon, 23 May 2005 13:40:00 -0400
Received: from mail-mx4.uio.no ([129.240.10.45])
	by pat.uio.no with esmtp (Exim 4.43)
	id 1DaGc7-0003La-KM; Mon, 23 May 2005 19:21:15 +0200
Received: from dh138.citi.umich.edu ([141.211.133.138])
	by mail-mx4.uio.no with esmtpsa (SSLv3:RC4-MD5:128) (Exim 4.43)
	id 1DaGc2-0006QG-N0; Mon, 23 May 2005 19:21:11 +0200
Subject: Re: [nfsv4] more re: client re-using lock_owner
From: Trond Myklebust <trond.myklebust@fys.uio.no>
To: David Robinson <David.Robinson@Sun.COM>
In-Reply-To: <bcd09c352f6b3f6d614da56e4d7a71ed@sun.com>
References: <C98692FD98048C41885E0B0FACD9DFB8BBBE55@exnane01.hq.netapp.com>
	<428E23A0.9090300@sun.com> <20050520180139.GA2423@fieldses.org>
	<428E782E.40304@sun.com> <1116645740.15684.84.camel@lade.trondhjem.org>
	<bcd09c352f6b3f6d614da56e4d7a71ed@sun.com>
Content-Type: text/plain
Date: Mon, 23 May 2005 13:21:06 -0400
Message-Id: <1116868866.11483.46.camel@lade.trondhjem.org>
Mime-Version: 1.0
X-Mailer: Evolution 2.2.1.1 
Content-Transfer-Encoding: 7bit
X-UiO-Spam-info: not spam, SpamAssassin (score=-3.595, required 12,
	autolearn=disabled, AWL 1.41, UIO_MAIL_IS_INTERNAL -5.00)
X-Spam-Score: 0.0 (/)
X-Scan-Signature: 6ffdee8af20de249c24731d8414917d3
Content-Transfer-Encoding: 7bit
Cc: nfsv4@ietf.org, rick@snowhite.cis.uoguelph.ca
X-BeenThere: nfsv4@ietf.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: NFSv4 Working Group <nfsv4.ietf.org>
List-Unsubscribe: <https://www1.ietf.org/mailman/listinfo/nfsv4>,
	<mailto:nfsv4-request@ietf.org?subject=unsubscribe>
List-Archive: <http://www1.ietf.org/pipermail/nfsv4>
List-Post: <mailto:nfsv4@ietf.org>
List-Help: <mailto:nfsv4-request@ietf.org?subject=help>
List-Subscribe: <https://www1.ietf.org/mailman/listinfo/nfsv4>,
	<mailto:nfsv4-request@ietf.org?subject=subscribe>
Sender: nfsv4-bounces@ietf.org
Errors-To: nfsv4-bounces@ietf.org

On Sun, 22.05.2005 at 18:11 (-0500), David Robinson wrote:
> > <connection 1>
> > OPEN
> > ||||
> > <network partition or server thread hangs>
> > 			<client initiates new connection>
> > 					OPEN
> > 					CLOSE
> >
> > 					OPEN
> > <network partition/hang clears>
> > ||||
> > first OPEN is processed on server
> >
> > You lose all ordering guarantees when the client is expected to
> > reconnect every time it replays a request.
> 
> Having the server request an OPEN_CONFIRM vs BADSEQID doesn't
> change the replay semantics.  You didn't put numbers in your example
> so let's use more details:
> 
> 1) Client sends OPEN oo1 seq1
> 							it gets delayed a long time
> 2) Client replays OPEN oo1 seq1 on new connection
> 3) Server performs OPEN now expecting seq2
> 4) Client sends CLOSE seq2
> 5) Server performs CLOSE and now has no active state
> 
> If the client's original OPEN now gets unstuck and gets
> presented to the server, what happens?
> 
> The server can legally be in one of two conditions: it either
> remembers that it is expecting seq3, or it may not have remembered
> or may have had its internal variables garbage collected.
> 
> If the server returns BADSEQID (which it legally can) the
> client will see that but it also will know that it is
> for an RPC that it has already discarded when it issued the
> replay, the XID is what governs this. A client replaying
> a request on a different connection with the same XID
> is just a broken RPC client. XID reuse can only be done
> for transport level retransmits. This is independent of
> a duplicate request cache being used or not.

Yes, however you are assuming that ignoring a reply is always an
acceptable thing to do. In a stateful model, like we have here, the
client may be missing a state change.

> If the server returns a request for OPEN_CONFIRM the client
> is in the same condition as above. It has already disposed
> of the initial OPEN and will fail to OPEN_CONFIRM ending
> the chain. This is not unlike an impatient client abandoning
> a normal first OPEN before the server responds.
> 
> In your example, it also appears to have the condition
> where after the CLOSE a new OPEN has succeeded on the
> new connection before the original OPEN gets processed.
> In this case the server MUST return BADSEQID because it
> has active state and it is an old value. If I recall the
> original discussions on this, it was proposed that the
> server simply drop the request as the client has clearly
> already moved past the original request. But it was decided
> to still respond in case some client could use this information
> in some way outside the required behavior (a clue that there
> may be bad hardware in the network?).

The server is not allowed to return BADSEQID if the sequence number
happened to be correct as in the following race.

<connection 1>
OPEN (seqid=0)
OPEN_CONFIRM(seqid=1)
OPEN (seqid=2)
||||
<network partition or server thread hangs - client gives up on request>
			<client initiates new connection 2>
					OPEN(seqid=2)
					CLOSE(seqid=3)
---
					OPEN (seqid=0)
					OPEN_CONFIRM (seqid=1)

<network partition/hang clears on connection 1>
||||
OPEN (seqid=2) is processed on server

If so, and if the client discards the reply (as it normally would,
because the XID didn't correspond to any outstanding requests) then it
will suddenly find that the server is returning NFS4ERR_OLD_STATEID on
subsequent READs and WRITEs.

Furthermore, the server thinks that the client is holding a share lock
and/or a delegation on the file that the client didn't record because it
discarded the reply. That will lead to a lot of delays when some
conflicting share lock comes in and/or the server tries to recall said
delegation. The share lock conflict may even require administrator
intervention in order to be solved.

There are also other interesting variants on the above scenario of the
form

<connection 1>
OPEN (seqid=0)
OPEN_CONFIRM(seqid=1)
OPEN (seqid=2)
||||
<network partition or server thread hangs - client gives up on request>
			<client initiates new connection 2>
					OPEN(seqid=2)
					CLOSE(seqid=3)
---
					OPEN (seqid=0)
					OPEN_CONFIRM (seqid=1)

<network partition/hang clears on connection 1>
||||
OPEN (seqid=2) is processed on server
					OPEN/CLOSE/LOCK/... (seqid=2)

...where the server sees the last request as a replay of the "invisible"
OPEN. Depending on how it has constructed its DRC, it may replay a bogus
reply.
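
The race in the traces above can be replayed against a toy model. This
is only a sketch under stated assumptions: `OpenOwnerState`, `forget`
and `run` are hypothetical names, the model tracks nothing but the last
seqid for one open_owner, and the `---` step in the trace is modelled as
the server dropping its state for that owner before the owner is reused.

```python
from typing import List, Optional, Tuple


class OpenOwnerState:
    """Toy server-side state for a single open_owner (illustrative only)."""

    def __init__(self) -> None:
        self.last: Optional[int] = None  # last seqid processed; None = no state

    def forget(self) -> None:
        """Model the server dropping all state for this owner."""
        self.last = None

    def handle(self, op: str, seqid: int) -> str:
        if self.last is None:
            if op != "OPEN":
                return "NFS4ERR_BAD_SEQID"  # simplification for non-OPEN ops
            self.last = seqid
            return "OPEN4_RESULT_CONFIRM"
        if seqid == self.last:
            return "REPLAY"                 # treated as a duplicate of the last op
        if seqid == self.last + 1:
            self.last = seqid
            return "NFS4_OK"
        return "NFS4ERR_BAD_SEQID"


def run(reuse_owner: bool) -> Tuple[OpenOwnerState, List[str]]:
    s = OpenOwnerState()
    results = [
        s.handle("OPEN", 0),
        s.handle("OPEN_CONFIRM", 1),
        # OPEN(seqid=2) on connection 1 is stuck; the client retries it
        # on connection 2 and then closes.
        s.handle("OPEN", 2),
        s.handle("CLOSE", 3),
    ]
    if reuse_owner:
        s.forget()  # the '---' step in the trace
        results += [s.handle("OPEN", 0), s.handle("OPEN_CONFIRM", 1)]
    # The delayed OPEN(seqid=2) from connection 1 finally arrives.
    results.append(s.handle("OPEN", 2))
    return s, results
```

In this model, `run(True)` ends with the stale OPEN being accepted as the
next request in sequence, and a later genuine request with seqid=2 is
answered as a replay. `run(False)`, where the server keeps the old seqid
and the owner is not reused, ends with NFS4ERR_BAD_SEQID.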

Cheers,
  Trond




From nfsv4-bounces@ietf.org Mon May 23 14:35:32 2005
Received: from localhost.localdomain ([127.0.0.1] helo=megatron.ietf.org)
	by megatron.ietf.org with esmtp (Exim 4.32)
	id 1DaHlz-0008HW-UZ; Mon, 23 May 2005 14:35:31 -0400
Received: from odin.ietf.org ([132.151.1.176] helo=ietf.org)
	by megatron.ietf.org with esmtp (Exim 4.32) id 1DaHly-0008HF-AR
	for nfsv4@megatron.ietf.org; Mon, 23 May 2005 14:35:30 -0400
Received: from ietf-mx.ietf.org (ietf-mx.ietf.org [132.151.6.1])
	by ietf.org (8.9.1a/8.9.1a) with ESMTP id OAA25734
	for <nfsv4@ietf.org>; Mon, 23 May 2005 14:35:29 -0400 (EDT)
Received: from brmea-mail-3.sun.com ([192.18.98.34])
	by ietf-mx.ietf.org with esmtp (Exim 4.33) id 1DaI40-0007pr-4e
	for nfsv4@ietf.org; Mon, 23 May 2005 14:54:08 -0400
Received: from phys-aus08-1 ([129.153.131.88])
	by brmea-mail-3.sun.com (8.12.10/8.12.9) with ESMTP id j4NIZSjO017020
	for <nfsv4@ietf.org>; Mon, 23 May 2005 12:35:28 -0600 (MDT)
Received: from conversion-daemon.aus08-mail1.central.sun.com by
	aus08-mail1.central.sun.com
	(iPlanet Messaging Server 5.2 HotFix 1.24 (built Dec 19 2003))
	id <0IGY00F01GWJE2@aus08-mail1.central.sun.com>
	(original mail from David.Robinson@Sun.COM) for nfsv4@ietf.org; Mon,
	23 May 2005 13:35:28 -0500 (CDT)
Received: from [129.153.128.60] (jetsun.Central.Sun.COM [129.153.128.60])
	by aus08-mail1.central.sun.com
	(iPlanet Messaging Server 5.2 HotFix 1.24 (built Dec 19 2003))
	with ESMTP id <0IGY00EPOGZ382@aus08-mail1.central.sun.com> for
	nfsv4@ietf.org; Mon, 23 May 2005 13:35:28 -0500 (CDT)
Date: Mon, 23 May 2005 13:35:27 -0500
From: David Robinson <David.Robinson@Sun.COM>
Subject: Re: [nfsv4] more re: client re-using lock_owner
In-reply-to: <1116868866.11483.46.camel@lade.trondhjem.org>
To: nfsv4@ietf.org
Message-id: <4292226F.4090308@sun.com>
MIME-version: 1.0
Content-type: text/plain; charset=ISO-8859-1; format=flowed
Content-transfer-encoding: 7BIT
X-Accept-Language: en-us, en
User-Agent: Mozilla Thunderbird 1.0 (X11/20050129)
References: <C98692FD98048C41885E0B0FACD9DFB8BBBE55@exnane01.hq.netapp.com>
	<428E23A0.9090300@sun.com> <20050520180139.GA2423@fieldses.org>
	<428E782E.40304@sun.com> <1116645740.15684.84.camel@lade.trondhjem.org>
	<bcd09c352f6b3f6d614da56e4d7a71ed@sun.com>
	<1116868866.11483.46.camel@lade.trondhjem.org>
X-Spam-Score: 0.0 (/)
X-Scan-Signature: 00e94c813bef7832af255170dca19e36
Content-Transfer-Encoding: 7BIT
X-BeenThere: nfsv4@ietf.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: NFSv4 Working Group <nfsv4.ietf.org>
List-Unsubscribe: <https://www1.ietf.org/mailman/listinfo/nfsv4>,
	<mailto:nfsv4-request@ietf.org?subject=unsubscribe>
List-Archive: <http://www1.ietf.org/pipermail/nfsv4>
List-Post: <mailto:nfsv4@ietf.org>
List-Help: <mailto:nfsv4-request@ietf.org?subject=help>
List-Subscribe: <https://www1.ietf.org/mailman/listinfo/nfsv4>,
	<mailto:nfsv4-request@ietf.org?subject=subscribe>
Sender: nfsv4-bounces@ietf.org
Errors-To: nfsv4-bounces@ietf.org

Trond Myklebust wrote:
> On Sun, 22.05.2005 at 18:11 (-0500), David Robinson wrote:

>>If the server returns BADSEQID (which it legally can) the
>>client will see that but it also will know that it is
>>for an RPC that it has already discarded when it issued the
>>replay, the XID is what governs this. A client replaying
>>a request on a different connection with the same XID
>>is just a broken RPC client. XID reuse can only be done
>>for transport level retransmits. This is independent of
>>a duplicate request cache being used or not.

> Yes, however you are assuming that ignoring a reply is always an
> acceptable thing to do. In a stateful model, like we have here, the
> client may be missing a state change.

But this is fundamental to RPC: replies need to be processed
in the context of the caller. If the caller has abandoned
the original request when it issues the replay, it has no
way to match up the now stale original reply. In theory
a caller could maintain state for all requests ever sent but
not replied to in order to do cleanup, but I don't know of
any RPC implementations that do this. So the stale reply will
just get dropped. More likely, a client will drop the connection
before issuing the replay so it doesn't have to worry about
stale replies.

> The server is not allowed to return BADSEQID if the sequence number
> happened to be correct as in the following race.

But this race exists today in existing implementations.  Simply insert
a sufficiently long pause after the CLOSE(seqid=3) so that the server
has garbage collected any old open_owner state as described in 8.1.7
and this exact scenario will happen. The server will not send
BADSEQID because it is now a "first time" OPEN.

> <connection 1>
> OPEN (seqid=0)
> OPEN_CONFIRM(seqid=1)
> OPEN (seqid=2)
> ||||
> <network partition or server thread hangs - client gives up on request>
> 			<client initiates new connection 2>
> 					OPEN(seqid=2)
> 					CLOSE(seqid=3)
> ---
> 					OPEN (seqid=0)
> 					OPEN_CONFIRM (seqid=1)
> 
> <network partition/hang clears on connection 1>
> ||||
> OPEN (seqid=2) is processed on server
> 
> If so, and if the client discards the reply (as it normally would,
> because the XID didn't correspond to any outstanding requests) then it
> will suddenly find that the server is returning NFS4ERR_OLD_STATEID on
> subsequent READs and WRITEs.

This could also happen if the time is long enough that the seqid
wraps before the original OPEN gets processed. The only real
defense here is that a client needs to track if it has replayed
an OPEN and on last close abandon that open_owner for as long as
possible. Again, this problem exists today if the server garbage
collects its open_owner state. Clients must be very careful about
issuing replays. I don't think it is right to ask the server to
maintain perpetual state to cover all the edge cases that are
better eliminated by the client being smarter on replays.

But this is not the case that we originally started talking about;
replays are rare. What should a server respond with when it gets
an OPEN with an out of range seqid and there is no active state
on that open_owner? The server can, and will, respond back with
either a request for an OPEN_CONFIRM or a BADSEQID. I still
contend that it is nicer for the server to request the OPEN_CONFIRM.

	-David



From nfsv4-bounces@ietf.org Mon May 23 15:08:46 2005
Received: from localhost.localdomain ([127.0.0.1] helo=megatron.ietf.org)
	by megatron.ietf.org with esmtp (Exim 4.32)
	id 1DaIIA-0005ba-8a; Mon, 23 May 2005 15:08:46 -0400
Received: from odin.ietf.org ([132.151.1.176] helo=ietf.org)
	by megatron.ietf.org with esmtp (Exim 4.32) id 1DaII8-0005bU-Tz
	for nfsv4@megatron.ietf.org; Mon, 23 May 2005 15:08:45 -0400
Received: from ietf-mx.ietf.org (ietf-mx.ietf.org [132.151.6.1])
	by ietf.org (8.9.1a/8.9.1a) with ESMTP id PAA29337
	for <nfsv4@ietf.org>; Mon, 23 May 2005 15:08:43 -0400 (EDT)
Received: from pat.uio.no ([129.240.130.16] ident=7411)
	by ietf-mx.ietf.org with esmtp (Exim 4.33) id 1DaIa8-0000iW-11
	for nfsv4@ietf.org; Mon, 23 May 2005 15:27:23 -0400
Received: from mail-mx2.uio.no ([129.240.10.30])
	by pat.uio.no with esmtp (Exim 4.43)
	id 1DaII1-0005ZG-S6; Mon, 23 May 2005 21:08:37 +0200
Received: from dh138.citi.umich.edu ([141.211.133.138])
	by mail-mx2.uio.no with esmtpsa (SSLv3:RC4-MD5:128) (Exim 4.43)
	id 1DaIHv-00058V-Oq; Mon, 23 May 2005 21:08:32 +0200
Subject: Re: [nfsv4] more re: client re-using lock_owner
From: Trond Myklebust <trond.myklebust@fys.uio.no>
To: David Robinson <David.Robinson@Sun.COM>
In-Reply-To: <4292226F.4090308@sun.com>
References: <C98692FD98048C41885E0B0FACD9DFB8BBBE55@exnane01.hq.netapp.com>
	<428E23A0.9090300@sun.com> <20050520180139.GA2423@fieldses.org>
	<428E782E.40304@sun.com> <1116645740.15684.84.camel@lade.trondhjem.org>
	<bcd09c352f6b3f6d614da56e4d7a71ed@sun.com>
	<1116868866.11483.46.camel@lade.trondhjem.org>
	<4292226F.4090308@sun.com>
Content-Type: text/plain; charset=UTF-8
Date: Mon, 23 May 2005 15:08:29 -0400
Message-Id: <1116875309.11483.88.camel@lade.trondhjem.org>
Mime-Version: 1.0
X-Mailer: Evolution 2.2.1.1 
Content-Transfer-Encoding: quoted-printable
X-UiO-Spam-info: not spam, SpamAssassin (score=-3.818, required 12,
	autolearn=disabled, AWL 1.18, UIO_MAIL_IS_INTERNAL -5.00)
X-Spam-Score: 0.0 (/)
X-Scan-Signature: 4adaf050708fb13be3316a9eee889caa
Content-Transfer-Encoding: quoted-printable
Cc: nfsv4@ietf.org
X-BeenThere: nfsv4@ietf.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: NFSv4 Working Group <nfsv4.ietf.org>
List-Unsubscribe: <https://www1.ietf.org/mailman/listinfo/nfsv4>,
	<mailto:nfsv4-request@ietf.org?subject=unsubscribe>
List-Archive: <http://www1.ietf.org/pipermail/nfsv4>
List-Post: <mailto:nfsv4@ietf.org>
List-Help: <mailto:nfsv4-request@ietf.org?subject=help>
List-Subscribe: <https://www1.ietf.org/mailman/listinfo/nfsv4>,
	<mailto:nfsv4-request@ietf.org?subject=subscribe>
Sender: nfsv4-bounces@ietf.org
Errors-To: nfsv4-bounces@ietf.org

On Mon, 23.05.2005 at 13:35 (-0500), David Robinson wrote:

> But this race exists today in existing implementations.  Simply insert
> a sufficiently long pause after the CLOSE(seqid=3) so that the server
> has garbage collected any old open_owner state as described in 8.1.7
> and this exact scenario will happen. The server will not send
> BADSEQID because it is now a "first time" OPEN.

No! That race does not currently exist, since existing clients are _not_
assuming that they can reuse open owners.

At best, the current scenario leads to a hanging OPEN(seqid==0) (which
does not involve any state being set up until an OPEN_CONFIRM has been
issued). Replaying the last CLOSE(seqid==XXX) should not normally cause
state to be set up again, nor will replaying Some_OPERATION(seqid==YYY <
XXX) => BADSEQID.

> This could also happen if the time is long enough that the seqid
> wraps before the original OPEN gets processed. The only real
> defense here is that a client needs to track if it has replayed
> an OPEN and on last close abandon that open_owner for as long as
> possible. Again, this problem exists today if the server garbage
> collects its open_owner state. Clients must be very careful about
> issuing replays. I don't think it is right to ask the server to
> maintain perpetual state to cover all the edge cases that are
> better eliminated by the client being smarter on replays.

Right, and this is why adding a new rule to RFC3530 that states that
open owners can be reused by clients would be bad. To do so increases
the number of possible race scenarios for no gain in functionality.

> But this is not the case that we originally started talking about;
> replays are rare. What should a server respond with when it gets
> an OPEN with an out of range seqid and there is no active state
> on that open_owner? The server can, and will, respond back with
> either a request for an OPEN_CONFIRM or a BADSEQID. I still
> contend that it is nicer for the server to request the OPEN_CONFIRM.

By returning BADSEQID, the server is helping client writers to debug
their implementations instead of just giving them more rope. I would
contend that is far more helpful in the long run.

Cheers,
  Trond




From nfsv4-bounces@ietf.org Mon May 23 15:32:49 2005
Received: from localhost.localdomain ([127.0.0.1] helo=megatron.ietf.org)
	by megatron.ietf.org with esmtp (Exim 4.32)
	id 1DaIfR-0001Cn-5e; Mon, 23 May 2005 15:32:49 -0400
Received: from odin.ietf.org ([132.151.1.176] helo=ietf.org)
	by megatron.ietf.org with esmtp (Exim 4.32) id 1DaIfP-0001Ci-Bi
	for nfsv4@megatron.ietf.org; Mon, 23 May 2005 15:32:47 -0400
Received: from ietf-mx.ietf.org (ietf-mx.ietf.org [132.151.6.1])
	by ietf.org (8.9.1a/8.9.1a) with ESMTP id PAA02123
	for <nfsv4@ietf.org>; Mon, 23 May 2005 15:32:45 -0400 (EDT)
Received: from mx1.netapp.com ([216.240.18.38])
	by ietf-mx.ietf.org with esmtp (Exim 4.33) id 1DaIxQ-0001PH-LH
	for nfsv4@ietf.org; Mon, 23 May 2005 15:51:25 -0400
Received: from smtp2.corp.netapp.com (10.57.159.114)
	by mx1.netapp.com with ESMTP; 23 May 2005 12:32:34 -0700
X-IronPort-AV: i="3.93,129,1115017200"; 
	d="scan'208"; a="172734742:sNHT26941168"
Received: from svlexc03.hq.netapp.com (svlexc03.corp.netapp.com
	[10.57.156.149])
	by smtp2.corp.netapp.com (8.13.1/8.13.1/NTAP-1.6) with ESMTP id
	j4NJWY3G012162; Mon, 23 May 2005 12:32:34 -0700 (PDT)
Received: from burgundy.hq.netapp.com ([10.56.10.66]) by
	svlexc03.hq.netapp.com with Microsoft SMTPSVC(6.0.3790.0); 
	Mon, 23 May 2005 12:32:33 -0700
Received: from exnane01.hq.netapp.com ([10.97.0.61]) by burgundy.hq.netapp.com
	with Microsoft SMTPSVC(5.0.2195.6713); 
	Mon, 23 May 2005 12:32:33 -0700
X-MimeOLE: Produced By Microsoft Exchange V6.5.7226.0
Content-class: urn:content-classes:message
MIME-Version: 1.0
Content-Type: text/plain;
	charset="us-ascii"
Content-Transfer-Encoding: quoted-printable
Subject: RE: [nfsv4] more re: client re-using lock_owner
Date: Mon, 23 May 2005 15:32:32 -0400
Message-ID: <C98692FD98048C41885E0B0FACD9DFB8BBBE65@exnane01.hq.netapp.com>
Thread-Topic: [nfsv4] more re: client re-using lock_owner
Thread-Index: AcVd/hmWppcijrheQ4+168MycfzBcQBMenxg
From: "Noveck, Dave" <Dave.Noveck@netapp.com>
To: "David Robinson" <David.Robinson@Sun.COM>, <nfsv4@ietf.org>
X-OriginalArrivalTime: 23 May 2005 19:32:33.0655 (UTC)
	FILETIME=[2AD6D070:01C55FCE]
X-Spam-Score: 1.2 (+)
X-Scan-Signature: 8136d1e28e0aab8a4e130297ed2e1fc4
Content-Transfer-Encoding: quoted-printable
Cc: 
X-BeenThere: nfsv4@ietf.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: NFSv4 Working Group <nfsv4.ietf.org>
List-Unsubscribe: <https://www1.ietf.org/mailman/listinfo/nfsv4>,
	<mailto:nfsv4-request@ietf.org?subject=unsubscribe>
List-Archive: <http://www1.ietf.org/pipermail/nfsv4>
List-Post: <mailto:nfsv4@ietf.org>
List-Help: <mailto:nfsv4-request@ietf.org?subject=help>
List-Subscribe: <https://www1.ietf.org/mailman/listinfo/nfsv4>,
	<mailto:nfsv4-request@ietf.org?subject=subscribe>
Sender: nfsv4-bounces@ietf.org
Errors-To: nfsv4-bounces@ietf.org

David Robinson wrote:
> > Noveck, Dave wrote:
> > The spec gives no constraints on this.  A server may drop the
> > owner state immediately after the last close of an open for this
> > owner if it wants.  The only consequence of doing that would be
> > a lot of extra OPEN_CONFIRM's and thus bad performance.  It would
> > be legal however.
> >
> > In the case we are talking about, where the hypothesis is that
> > the server does indeed have information about the openowner, it
> > is clear this is not the "first time" the server is seeing that
> > owner and the server knows that.  The only reason that this is
> > not illegal is that judgments of spec legality are subject to
> > an epistemological constraint.  They may not refer to details
> > of the internals of the server, even though we may very well
> > know about them.  If something is illegal it must be determinable
> > from behavior visible over the wire.  But note that I am not
> > saying this is illegal.  I'm just saying it seems bad to me
> > and I think I may take into account what we know about the
> > actual situation when we discuss those questions.
>
> As the specification is written, it is perfectly legal to return
> either NFS4ERR_BADSEQID or OPEN4_RESULT_CONFIRM. The client must
> be prepared to handle either. I am not advocating that either
> method is illegal, just that OPEN4_RESULT_CONFIRM would be
> preferred.

I know I said that returning OPEN4_RESULT_CONFIRM is legal, but if
so it is barely legal.  Perhaps "not provably illegal" would be a
better way to summarize my view of this.  I certainly cannot understand
how you could describe this as "preferred".

> >>And how is a client expected
> >>to be able to reliably determine that fact?
>
> The spec's use of 'first_time' may lead the reader to
> expect that the client somehow tracks the server's notion
> of 'first time', that doesn't seem reasonable nor reliable
> to do.

The fact that there is no way for the client to do this should
probably suggest to someone interpreting the spec this way, that
he is headed down the wrong path.

> The whole issue of 'first_time' is simply an optimization
> by the server to minimize OPEN_CONFIRM traffic. The minimal
> required functionality is to OPEN4_RESULT_CONFIRM if there
> is no active state.


> My argument is that when a server cannot perform the optimization
> because the seqid used is not the correct next seqid, it
> should act as if it were starting from scratch and had not kept
> track of the last seqid, and reply with OPEN4_RESULT_CONFIRM.

But why would you do that?

> The difference is that in one case the server replaces the notion
> of the next valid seqid

Exactly.  He treats a seqid that he knows to be incorrect as if
it were correct.  And the justification, that there are cases in
which the server might not have the information to validate the
sequence id, and depends on the client to do that, does not justify
the server in acting as if he can't.

The concept of the next valid sequence id is that it is the one
which is one greater than the last valid sequence id.  If you look
at the wire, that is what you should see.  For a given owner, the
ids should increase.  In this case, if the server replaces his
notion of the next valid sequence id with one that is different
from that value, he is doing something wrong.  The only reason it is
approximately legal is that you can't prove, just looking at the
wire, that the server has the necessary information.  Since you
don't know that the server was not the innocent victim here (and
the client just lied to him), you just can't convict.  But it
is a far cry from that sort of legality to simply assume that
when the server changes his notion of what the next valid seqid is,
that has any effect on what the real next valid seqid is.  You
have a situation in which the seqid ascends and then descends when
the spec says it should ascend, except for wraparound at 4 billion,
I suppose.
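
The wire-visible property described here (for a given owner the seqid
only ever repeats or steps up by one, apart from wraparound) can be
checked mechanically. The sketch below is illustrative only:
`find_seqid_violations` is a hypothetical helper, and it assumes the
input is the list of seqids seen on the wire for one owner, with
wraparound at 2**32.

```python
from typing import Iterable, List

SEQID_MODULUS = 2**32  # assumption: wraparound "at 4 billion"


def find_seqid_violations(seqids: Iterable[int]) -> List[int]:
    """Return the indexes at which the seqid neither repeats nor advances by one.

    A repeated value is tolerated because it may be a retransmission of
    the last request; anything else, including a reset to zero, is flagged.
    """
    violations: List[int] = []
    prev = None
    for i, s in enumerate(seqids):
        if prev is not None and s != prev and s != (prev + 1) % SEQID_MODULUS:
            violations.append(i)
        prev = s
    return violations
```

Applied to the sequence 0, 1, 2, 3, 0, 1 (an owner whose seqid is reset
after a close), the helper flags index 4, the point where the value
descends.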

> and in the other the client must create
> a new unique open_owner.

He can either create a new open_owner or, if he doesn't want to do
that, he can use an old one but if he uses an old one then the protocol
has a notion of what the next valid seqid is, and the client should
conform to it.

> This would seem to be easier for the
> server as it is not leaving behind 'old' open_owners to track

If the client wants to use a small set of owners and not leave old
ones around, it may do so.  What it may not do is change the next
seqid whenever it feels like, even if it only feels like it when
there is no open state.

> and better for the client as it is one less round trip by
> saving the OPEN with new open_owner request.

If you make a request with an old open_owner and you have no idea
what the next valid seqid is, then it is very likely that you will
get BADSEQID.  My suggestion is, if you don't like the extra round
trip, don't do that.  You are in exactly the same position after
getting the BADSEQID as you were before.  All the server has done
is obey the protocol, rather than cut you the sort of slack that
is going to do you an injury in the end.

> [ignoring my bad and withdrawn example]


> >>So I would like Door #2 Monty...
>
> > It is legal.  You may go through the door.  If I say overly harsh
> > things about your choice, don't take it personally.  I just don't
> > think it is the right choice, even though I agree it is legal.
>
> We agree that both are legal. I just find that if the server is
> nicer the client has less work to do. Asking for an OPEN_CONFIRM
> feels friendlier than issuing an NFS4ERR_BADSEQID.

It is not friendlier to accept what is invalid from the client and
treat it as if it were valid.  Error checking is part of the protocol
too.

Rather than argue about the precise characterization of the server's
behavior on a legality-illegality continuum, I think the critical
issue is the legality of what you assume the client will be doing.

You are supposing that a client will send a request with a seqid
that is not valid, and then when the server asks it to confirm that
that is valid (since it finds itself without the necessary information),
the client confirms the one it sent, changing the server's notion of
the next valid seqid to one which does not match RFC3530's notion of
the next valid seqid, which is the next one in numerical sequence.
That is where we have a stark disagreement about what is legal and
illegal.

> >>Reasonable behavior should be:
>
> > We need to separate what someone thinks is reasonable from what
> > is in RFC3530.  If you think the spec should have allowed clients
> > to forget state at will, then that is one thing.  But I don't think
> > the spec actually did that.

> I believe that the spec is agnostic to what the client and server
> do with respect to maintaining open_owner seqid numbers when
> there is no active state.

It is not agnostic about what the valid seqid is.  The spec says:

   Locking is different than most NFS operations as it requires "at-
   most-one" semantics that are not provided by ONCRPC.  ONCRPC over a
   reliable transport is not sufficient because a sequence of locking
   requests may span multiple TCP connections.  In the face of
   retransmission or reordering, lock or unlock requests must have a
   well defined and consistent behavior.  To accomplish this, each lock
   request contains a sequence number that is a consecutively increasing
   integer.  Different lock_owners have different sequences.

"that is a consecutively increasing integer" does not sound all that
agnostic to me.  I don't see any license for the client to arbitrarily
reset the value to zero at any time, whether the last operation was a
close of the only current open for that owner or not.

> The server can maintain nothing
> always requiring an OPEN_CONFIRM and the client can use
> any value but must be willing to accept a NFS4ERR_BADSEQID
> error.

Correct so far, but what I don't think is valid is for the client to
use an invalid seqid and then, when he gets a request to confirm,
take advantage of the server's (presumed) lack of knowledge and
foist an invalid seqid on him.

> >>1) Client MUST use the correct next seqid if there is active state
> >>   for that lock_owner.
> >>2) Client SHOULD use the correct next seqid if there is no
> >>   active state for that lock_owner.
>
> > Strictly speaking, the client may send any seqid he wants but if it
> > is not the correct one, he has to be prepared for BADSEQID.  If there
> > is open state and he uses an incorrect seqid, then he may rely on
> > getting BADSEQID.  If there is no open, he may not rely on getting that
> > error, since the server may have forgotten the owner state.
>
> Agreed.
>
> >>4) The server SHOULD allow the OPEN if the lock_owner has the
> >>   correct seqid and there is no active state for that lock_owner.
>=20
> > He must process the open normally.  Whether it is allowed depends on
> > lots of stuff.

> Agreed, "allowed" is dependent on all the other stuff involved
> in OPEN.  This case is the strong SHOULD that a server ought to
> implement the optimization eliminating OPEN_CONFIRMs.

> >>5) The server SHOULD (MUST?) return OPEN4_RESULT_CONFIRM if the
> >>   lock_owner has an incorrect seqid and there is no active state
> >>   for that lock_owner.

> > Why?  If you assert that this is somehow required by RFC3530, then
> > where exactly does it say that?  I think it says that if it knows
> > a seqid is invalid it should return BADSEQID.  I don't think the fact
> > that it might sometimes not know that changes that.  It may be
> > that 5) could have been added to the spec without harm, but it
> > wasn't.
> >
> > As far as the (MUST?) alternative, how can you say you "SHOULD"
> > do the open if he has the correct seqid and "MUST" do it if he
> > has an incorrect seqid?

> The (MUST?) is more of an editorial comment that the RFC doesn't require
> this, but if I had thought of this scenario a few years ago I would
> have pushed to make this a MUST.

In which case I would have pushed for a clear statement that if the
server does have the information necessary to decide the issue of seqid
validity, it MUST return BADSEQID.

> But it is not required anywhere,
> just as the server is not required to maintain the last valid seqid
> when there is no active state just so it can return BADSEQID.

Well I would contrariwise strongly urge servers to do that, at least
for a considerable time, and particularly so  if there are clients
that feel that they can change the next valid seqid at will, as long
as there is no current open state.

> >>So in normal processing everyone SHOULD make a reasonable effort
> >>to maintain enough information to minimize the need to use
> >>OPEN_CONFIRM. But in the absence of active state, if one side
> >>forgets, it is useful to allow the lock_owner to be reused
> >>by sending OPEN4_RESULT_CONFIRM instead of NFS4ERR_BADSEQID.
>
> > I don't see how it is useful.  The client should not reuse the
> > same opaque string for different owners and if he forgets the
> > seqid, then these are different owners.
>
> A simple client may choose to use a simple value as the open_owner
> and want to maintain very minimal state and do something trivial
> like restart the seqid at zero if there are no open files.

He may want to initially, but as he thinks about the consequences,
I would expect him to decide to do something else, that makes more
sense given the protocol.

The spec requires, for a given open_owner, ascending seqids.  There are
two ways, each pretty simple, to do that:

1) Use a fixed or at least only expanding set of owners and reuse the
   existing ones as necessary, and maintain the current next seqid upon
   reuse.

2) Allocate and deallocate owners as you wish but make sure that when
   you think you are creating a new open_owner, that it really is a new
   open_owner (i.e. that the opaque string is different from any seen
   before).  Simply adding an integer or a time to make the string
   unique is a simple way to do that.
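
To make the two approaches concrete, here is a minimal client-side
sketch.  It is only an illustration: the class names, the owner string
format, and the use of a boot timestamp plus a counter are assumptions
made for this example and do not come from RFC3530 or from any real
client.

```python
import itertools
import time


class ReusedOwners:
    """Approach 1: a fixed pool of owners, each with a remembered next seqid."""

    def __init__(self, names):
        # Next seqid to send for every owner string in the pool.
        self._next_seqid = {name: 0 for name in names}

    def next_request(self, name):
        # Hand out the current seqid and advance it, so reuse of the
        # owner always continues the same ascending sequence.
        seqid = self._next_seqid[name]
        self._next_seqid[name] = seqid + 1
        return name, seqid


class UniqueOwners:
    """Approach 2: never reuse an opaque string, so each owner may start at 0."""

    def __init__(self, prefix):
        self._prefix = prefix
        self._boot = int(time.time())      # hypothetical: separates client restarts
        self._counter = itertools.count()  # separates owners within one boot

    def new_owner(self):
        return f"{self._prefix}-{self._boot}-{next(self._counter)}"
```

Either way the client never has to hunt for an owner the server will
accept: approach 1 costs one integer per owner, approach 2 costs one
counter for the whole client.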

> In the BADSEQID case the client will need to do the following
> operations:
>	OPEN oo1 seq0 ->
>		<- BADSEQID
>	OPEN oo2 seq0 ->
>		<- OPEN4_RESULT_CONFIRM
>	OPEN_CONFIRM ->
>		<- OK
> This could be even worse if the choice of 'oo2' also conflicted
> with some previous open_owner that the server is still tracking
> the seqid for, triggering multiple tries by the client to get
> a valid open_owner.

Exactly.  You have yourself a mess.  What you thought was simpler
is not in fact simpler.  Both 1) and 2) are simpler than that since
you never will have to hunt for a valid owner.  If it hurts when you
do that, don't do that.

> If the server is nicer, it can be reduced to:
> 	OPEN oo1 seq0 ->
>		<- OPEN4_RESULT_CONFIRM
>	OPEN_CONFIRM ->
>		<- OK

Sure, but the client has to be prepared for BADSEQID and code for that
case, so it still isn't as simple as 1) and 2).  If you want a simple
client, then those are two good choices, and they don't require any
special niceness on the part of the server.

There are three consequences of doing this:

1) you have code complexity to deal with the BADSEQID return, which
   otherwise is just a log-this-and-report-a-serious-error type problem
   (applies no matter what the server does).

2) you have to hunt and incur latency trying to find a seqid which works
   (applies when the server is not "friendly", but note that most
   existing servers are "friendly" in that regard).

3) you compromise RFC3530 replay detection for locking requests (applies
   to all sorts of servers since owner information may be dropped, but
   the issue is more severe with "friendly" servers).

You accept that 1) is the case.

It appears that you accept that 2) is the case, but instead of
concluding that the client should not do this, you want the server to
be more "friendly".

You argue that 3) is not the case, but even putting that aside, I can't
see why any client would persist in this approach given 1) and 2).  I
know you think this is simple but as Einstein (I think) said,
"Everything should be as simple as possible, but not more so".

> The client doesn't need to be as aggressive at creating new
> open_owners and the server doesn't have a growing list
> of open_owner/seqid pairs to track to optimize a case
> that is very unlikely to ever occur.

The client does not have to be aggressive at creating new owners.
It can reuse old ones as long as it uses the correct seqid when
it does.  If it doesn't want to bother with that small amount of
storage it can make sure the new ones are unique.

> Even a server that
> returns BADSEQID ought to just choose to forget that
> open_owner as it has effectively just told that client to
> never use it again.

You don't know that it has told the client anything.  If a client
is behaving then typically BADSEQID will be sent in response to
replays such that the client is not looking for the response.

> So it might as well just allow reuse
> by issuing an OPEN4_RESULT_CONFIRM.

Reuse when the seqid is incorrect is, well, incorrect.  When the
client and the server reset the seqid to a value that it had before,
you have broken the basic logic for at-most-once semantics, which
depends on there being, for every clientid/owner/seqid triple, only
one request.  If there are two or more instances then some may be
executed more than once and you have at-most-a-handful-of-times
semantics, which isn't what the spec calls for.
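
A toy model may help show how a reset sequence lets a stale duplicate
slip through.  This is a sketch under heavy assumptions: the ToyServer
class, its method names, and the operation strings are invented for the
illustration, and it ignores stateids, OPEN_CONFIRM, and everything
else a real server checks.  It models only the "one request per
clientid/owner/seqid triple" logic.

```python
class ToyServer:
    """Toy model: per (clientid, owner) it remembers only the last seqid."""

    def __init__(self):
        self.last = {}       # (clientid, owner) -> (last seqid, cached reply)
        self.executed = []   # every operation that was actually carried out

    def request(self, clientid, owner, seqid, op):
        key = (clientid, owner)
        if key in self.last:
            last_seqid, cached = self.last[key]
            if seqid == last_seqid:
                return cached            # retransmission: replay cached reply
            if seqid != last_seqid + 1:
                return "BADSEQID"        # neither a replay nor the next one
        self.executed.append(op)
        self.last[key] = (seqid, "OK")
        return "OK"

    def reset_owner(self, clientid, owner):
        """Models the server accepting a restarted sequence for this owner."""
        self.last.pop((clientid, owner), None)


def demo(reset):
    s = ToyServer()
    for seqid, op in enumerate(["open A", "close A", "open C", "close C"]):
        s.request("c1", "oo1", seqid, op)
    start = 0 if reset else 4            # restart at zero, or keep ascending
    if reset:
        s.reset_owner("c1", "oo1")
    s.request("c1", "oo1", start, "open B")
    s.request("c1", "oo1", start + 1, "close B")
    # A delayed duplicate of the old "open C" (seqid 2) finally arrives.
    reply = s.request("c1", "oo1", 2, "open C")
    return reply, s.executed.count("open C")
```

With demo(False) the duplicate is rejected and "open C" has been carried
out once; with demo(True) the duplicate looks like the next request in
the restarted sequence and "open C" is carried out a second time.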

> > You may think so but the fact is that the spec discusses the
> > case of the server forgetting the stateid and never mentions
> > the case of the client forgetting it.  If you think that that is
> > an oversight, then you need an argument to show why this is a
> > necessary case to deal with.  I assert that it is not, since the
> > cases of the server and the client are not parallel.  The client
> > decides on the lifetime of owners and the server has no way
> > of knowing when the client is done with one.  This requires special
> > handling to deal with the case of a server who is forced to drop
> > state.  The client doesn't have that problem.  He may drop owners
> > at will with no special protocol provision, as long as he doesn't
> > try to forget the seqid and then use that same owner string again.

> I don't think there is an oversight.  The RFC chose to describe how
> a server can be optimal and remember old seqid's when there is no
> active state. That does not mean that it is incorrect to be silent
> about how a client may or may not reuse an open_owner.

I agree that there is no oversight.

I don't think that the spec is incorrect in being silent about
open_owner reuse.  I'm not saying the spec is incorrect, but I am
saying that the fact that it chose to be specific about a server
forgetting state and did not mention anything about a client forgetting
state is something you can draw conclusions from (in part, because we
agree that it is not an oversight).

The spec says that the seqid is an ascending sequence (for a given
owner).

The implication of this is that the one generating requests (i.e. the
client) must know the previous seqid so it can generate the next one.

If you think that the spec intends there to be an exception, then
one of the following has to hold:

     There are words that say that the client is free to change the
     seqid value at any point at which there is no open state for that
     open_owner.  (Are there any such?)

     It is intended but not stated (i.e. it is an oversight, which we
     agree is not the case).

     It is clearly implied because the client can't successfully operate
     without it.  (That is also not the case, although we can imagine
     client implementations that will not work, but so what?)

The fact that none of the above apply leads me to conclude that there is
no exception.  It is quite possible for a simple client to provide
ascending seqid's for a given owner.


> Functionally
> as written the RFC is complete, I am just proposing that the server
> can be friendlier in the case where it chooses to remember old
> seqids.

I think it is friendlier, when the server is capable of deciding that a
given seqid is incorrect, to tell the client.  I don't see that
struggling to limit error detection to the absolute minimum allowed by
the spec is friendly.

> All of this is really picking nits on a very rare edge case. The server
> is always correct to return BADSEQID when there is no active
> state for an open_owner. But it can be more optimal or friendly.

But we are also talking about what the client may do and the behavior
that you believe the server should accommodate seems to me to be
clearly wrong.  You say the client should accept BADSEQID but it is
not clear exactly how a client might not accept a BADSEQID response
if that is what the server sends :-).  The issue is what conclusion
a client implementer might draw from getting BADSEQID.  Either,

     The client should decide that it should try a tiny bit harder
     to generate ascending seqids for a given owner as the spec says
     it should.

     The client decides that the server is not sufficiently friendly.

In the latter case, the client's acceptance of the BADSEQID is kind of
grudging.  I don't think that is what RFC3530 intended.

I suppose that there is little point in continuing this discussion,
except to point out that it will be rendered moot by sessions, and
that's a good thing.

-----Original Message-----
From: David Robinson [mailto:David.Robinson@Sun.COM]
Sent: Friday, May 20, 2005 8:47 PM
To: nfsv4@ietf.org
Subject: Re: [nfsv4] more re: client re-using lock_owner


Noveck, Dave wrote:

> The spec gives no constraints on this.  A server may drop the
> owner state immediately after the last close of an open for this
> owner if it wants.  The only consequence of doing that would be
> a lot of extra OPEN_CONFIRM's and thus bad performance.  It would
> be legal however.
>
> In the case we are talking about, where the hypothesis is that
> the server does indeed have information about the openowner, it
> is clear this is not the "first time" the server is seeing that
> owner and the server knows that.  The only reason that this is
> not illegal is that judgments of spec legality are subject to
> an epistemological constraint.  They may not refer to details
> of the internals of the server, even though we may very well
> know about them.  If something is illegal it must be determinable
> from behavior visible over the wire.  But note that I am not
> saying this is illegal.  I'm just saying it seems bad to me
> and I think I may take into account what we know about the
> actual situation when we discuss those questions.

As the specification is written, it is perfectly legal to return
either NFS4ERR_BADSEQID or OPEN4_RESULT_CONFIRM. The client must
be prepared to handle either. I am not advocating that either
method is illegal, just that OPEN4_RESULT_CONFIRM would be
preferred.

>>And how is a client expected
>>to be able to reliably determine that fact?

The spec's use of 'first_time' may lead the reader to
expect that the client somehow tracks the server's notion
of 'first time'; that seems neither reasonable nor reliable
to do. The whole issue of 'first_time' is simply an optimization
by the server to minimize OPEN_CONFIRM traffic. The minimal
required functionality is to return OPEN4_RESULT_CONFIRM if there
is no active state.

My argument is that a server that cannot perform the optimization
because the seqid used is not the correct next seqid should
act as if it were starting from scratch and had not kept
track of the last seqid, and reply with OPEN4_RESULT_CONFIRM.

The difference is that in one case the server replaces the notion
of the next valid seqid and in the other the client must create
a new unique open_owner. This would seem to be easier for the
server as it is not leaving behind 'old' open_owners to track
and better for the client as it is one less round trip by
saving the OPEN with new open_owner request.

[ignoring my bad and withdrawn example]


>>So I would like Door #2 Monty...

> It is legal.  You may go through the door.  If I say overly harsh
> things about your choice, don't take it personally.  I just don't
> think it is the right choice, even though I agree it is legal.

We agree that both are legal. I just find that if the server is
nicer the client has less work to do. Asking for an OPEN_CONFIRM
feels friendlier than issuing an NFS4ERR_BADSEQID.

>>Reasonable behavior should be:

> We need to separate what someone thinks is reasonable from what
> is in RFC3530.  If you think the spec should have allowed clients
> to forget state at will, then that is one thing.  But I don't think
> the spec actually did that.

I believe that the spec is agnostic to what the client and server
do with respect to maintaining open_owner seqid numbers when
there is no active state. The server can maintain nothing
always requiring an OPEN_CONFIRM and the client can use
any value but must be willing to accept a NFS4ERR_BADSEQID
error.

>>1) Client MUST use the correct next seqid if there is active state
>>   for that lock_owner.
>>2) Client SHOULD use the correct next seqid if there is no
>>   active state for that lock_owner.

> Strictly speaking, the client may send any seqid he wants but if it
> is not the correct one, he has to be prepared for BADSEQID.  If there
> is open state and he uses an incorrect seqid, then he may rely on
> getting BADSEQID.  If there is no open, he may not rely on getting that
> error, since the server may have forgotten the owner state.

Agreed.

>>4) The server SHOULD allow the OPEN if the lock_owner has the
>>   correct seqid and there is no active state for that lock_owner.

> He must process the open normally.  Whether it is allowed depends on
> lots of stuff.

Agreed, "allowed" is dependent on all the other stuff involved
in OPEN.  This case is the strong SHOULD that a server ought to
implement the optimization eliminating OPEN_CONFIRMs.

>>5) The server SHOULD (MUST?) return OPEN4_RESULT_CONFIRM if the
>>   lock_owner has an incorrect seqid and there is no active state
>>   for that lock_owner.

> Why?  If you assert that this is somehow required by RFC3530, then
> where exactly does it say that?  I think it says that if it knows
> a seqid is invalid it should return BADSEQID.  I don't think the fact
> that it might sometimes not know that changes that.  It may be
> that 5) could have been added to the spec without harm, but it
> wasn't.
>
> As far as the (MUST?) alternative, how can you say you "SHOULD"
> do the open if he has the correct seqid and "MUST" do it if he
> has an incorrect seqid?

The (MUST?) is more of an editorial comment that the RFC doesn't require
this, but if I had thought of this scenario a few years ago I would
have pushed to make this a MUST. But it is not required anywhere,
just as the server is not required to maintain the last valid seqid
when there is no active state just so it can return BADSEQID.

>>So in normal processing everyone SHOULD make a reasonable effort
>>to maintain enough information to minimize the need to use
>>OPEN_CONFIRM. But in the absence of active state, if one side
>>forgets, it is useful to allow the lock_owner to be reused
>>by sending OPEN4_RESULT_CONFIRM instead of NFS4ERR_BADSEQID.

> I don't see how it is useful.  The client should not reuse the
> same opaque string for different owners and if he forgets the
> seqid, then these are different owners.

A simple client may choose to use a simple value as the open_owner
and want to maintain very minimal state and do something trivial
like restart the seqid at zero if there are no open files.

In the BADSEQID case the client will need to do the following
operations:
	OPEN oo1 seq0 ->
		<- BADSEQID
	OPEN oo2 seq0 ->
		<- OPEN4_RESULT_CONFIRM
	OPEN_CONFIRM ->
		<- OK
This could be even worse if the choice of 'oo2' also conflicted
with some previous open_owner that the server is still tracking
the seqid for, triggering multiple tries by the client to get
a valid open_owner.

If the server is nicer, it can be reduced to:
	OPEN oo1 seq0 ->
		<- OPEN4_RESULT_CONFIRM
	OPEN_CONFIRM ->
		<- OK

The client doesn't need to be as aggressive at creating new
open_owners and the server doesn't have a growing list
of open_owner/seqid pairs to track to optimize a case
that is very unlikely to ever occur. Even a server that
returns BADSEQID ought to just choose to forget that
open_owner as it has effectively just told that client to
never use it again.  So it might as well just allow reuse
by issuing an OPEN4_RESULT_CONFIRM.

> You may think so but the fact is that the spec discusses the
> case of the server forgetting the stateid and never mentions
> the case of the client forgetting it.  If you think that that is
> an oversight, then you need an argument to show why this is a
> necessary case to deal with.  I assert that it is not, since the
> cases of the server and the client are not parallel.  The client
> decides on the lifetime of owners and the server has no way
> of knowing when the client is done with one.  This requires special
> handling to deal with the case of a server who is forced to drop
> state.  The client doesn't have that problem.  He may drop owners
> at will with no special protocol provision, as long as he doesn't
> try to forget the seqid and then use that same owner string again.

I don't think there is an oversight.  The RFC chose to describe how
a server can be optimal and remember old seqid's when there is no
active state. That does not mean that it is incorrect to be silent
about how a client may or may not reuse an open_owner. Functionally
as written the RFC is complete, I am just proposing that the server
can be friendlier in the case where it chooses to remember old
seqids.

All of this is really picking nits on a very rare edge case. The server
is always correct to return BADSEQID when there is no active
state for an open_owner. But it can be more optimal or friendly.

	-David

_______________________________________________
nfsv4 mailing list
nfsv4@ietf.org
https://www1.ietf.org/mailman/listinfo/nfsv4



From nfsv4-bounces@ietf.org Thu May 26 00:02:41 2005
Received: from localhost.localdomain ([127.0.0.1] helo=megatron.ietf.org)
	by megatron.ietf.org with esmtp (Exim 4.32)
	id 1Db9Zw-0005YF-MZ; Thu, 26 May 2005 00:02:40 -0400
Received: from odin.ietf.org ([132.151.1.176] helo=ietf.org)
	by megatron.ietf.org with esmtp (Exim 4.32) id 1Db9Zu-0005YA-EU
	for nfsv4@megatron.ietf.org; Thu, 26 May 2005 00:02:38 -0400
Received: from ietf-mx.ietf.org (ietf-mx.ietf.org [132.151.6.1])
	by ietf.org (8.9.1a/8.9.1a) with ESMTP id AAA02350
	for <nfsv4@ietf.org>; Thu, 26 May 2005 00:02:35 -0400 (EDT)
Received: from gw-w.panasas.com ([63.80.58.206] helo=medlicott.panasas.com)
	by ietf-mx.ietf.org with esmtp (Exim 4.33) id 1Db9sP-0005A6-93
	for nfsv4@ietf.org; Thu, 26 May 2005 00:21:46 -0400
Received: from panasas.com (welch@localhost)
	by medlicott.panasas.com (8.11.6/8.11.6) with ESMTP id j4Q42Ye26449
	for <nfsv4@ietf.org>; Wed, 25 May 2005 21:02:35 -0700
Message-Id: <200505260402.j4Q42Ye26449@medlicott.panasas.com>
X-Authentication-Warning: medlicott.panasas.com: welch owned process doing -bs
To: nfsv4@ietf.org
From: Brent Welch <welch@panasas.com>
X-URL: http://www.panasas.com/
X-Face: "HxE|?EnC9fVMV8f70H83&{fgLE.|FZ^$>@Q(yb#N,
	Eh~N]e&]=>r5~UnRml1:4EglY{9B+
	:'wJq$@c_C!l8@<$t,{YUr4K,QJGHSvS~U]H`<+L*x?eGzSk>XH\W:AK\j?@?c1o<k;
	j'Ei/UL)!*0
	ILwSR)J\bc)gjz!rrGQ2#i*f:M:ydhK}jp4dWQW?;0{,#iWrCV$4~%e/3)$1/D
MIME-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-ID: <26447.1117080154.1@panasas.com>
Date: Wed, 25 May 2005 21:02:34 -0700
X-Spam-Score: 0.0 (/)
X-Scan-Signature: 856eb5f76e7a34990d1d457d8e8e5b7f
Subject: [nfsv4] draft-welch-pnfs-ops-01.txt
X-BeenThere: nfsv4@ietf.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: NFSv4 Working Group <nfsv4.ietf.org>
List-Unsubscribe: <https://www1.ietf.org/mailman/listinfo/nfsv4>,
	<mailto:nfsv4-request@ietf.org?subject=unsubscribe>
List-Archive: <http://www1.ietf.org/pipermail/nfsv4>
List-Post: <mailto:nfsv4@ietf.org>
List-Help: <mailto:nfsv4-request@ietf.org?subject=help>
List-Subscribe: <https://www1.ietf.org/mailman/listinfo/nfsv4>,
	<mailto:nfsv4-request@ietf.org?subject=subscribe>
Sender: nfsv4-bounces@ietf.org
Errors-To: nfsv4-bounces@ietf.org

I have recently put up draft-welch-pnfs-ops-01.txt, which I
had planned to get out just before the last IETF.  This draft
will be the starting point for a new draft we are preparing for
the August IETF based on experience with the NetApp/Sun files-based
prototype for pNFS.  To be clear, the welch-01 draft does *not*
reflect that work.  Mainly I wanted to keep the draft current.
Our plan is to add specifics for an NFSv4-based backend protocol
and file-based layout for the August IETF.  There will be a pnfs
working meeting on June 16 associated with the bake-off in Ann Arbor.

See http://www.ietf.org/internet-drafts/draft-welch-pnfs-ops-01.txt
--
Brent Welch
Software Architect, Panasas Inc
Accelerating Time to Results(tm) with Clustered Storage

www.panasas.com
welch@panasas.com

_______________________________________________
nfsv4 mailing list
nfsv4@ietf.org
https://www1.ietf.org/mailman/listinfo/nfsv4


From nfsv4-bounces@ietf.org Thu May 26 14:54:40 2005
Received: from localhost.localdomain ([127.0.0.1] helo=megatron.ietf.org)
	by megatron.ietf.org with esmtp (Exim 4.32)
	id 1DbNVA-0001sT-U5; Thu, 26 May 2005 14:54:40 -0400
Received: from odin.ietf.org ([132.151.1.176] helo=ietf.org)
	by megatron.ietf.org with esmtp (Exim 4.32) id 1DbNVA-0001sO-2E
	for nfsv4@megatron.ietf.org; Thu, 26 May 2005 14:54:40 -0400
Received: from ietf-mx.ietf.org (ietf-mx.ietf.org [132.151.6.1])
	by ietf.org (8.9.1a/8.9.1a) with ESMTP id OAA21091
	for <nfsv4@ietf.org>; Thu, 26 May 2005 14:54:38 -0400 (EDT)
Received: from mx1.netapp.com ([216.240.18.38])
	by ietf-mx.ietf.org with esmtp (Exim 4.33) id 1DbNnm-0000HH-QU
	for nfsv4@ietf.org; Thu, 26 May 2005 15:13:56 -0400
Received: from smtp1.corp.netapp.com (10.57.156.124)
	by mx1.netapp.com with ESMTP; 26 May 2005 11:54:25 -0700
X-IronPort-AV: i="3.93,140,1115017200"; 
	d="scan'208"; a="174434643:sNHT17059660"
Received: from svlexc03.hq.netapp.com (svlexc03.corp.netapp.com
	[10.57.156.149])
	by smtp1.corp.netapp.com (8.13.1/8.13.1/NTAP-1.6) with ESMTP id
	j4QIsPY6004236
	for <nfsv4@ietf.org>; Thu, 26 May 2005 11:54:25 -0700 (PDT)
Received: from burgundy.hq.netapp.com ([10.56.10.66]) by
	svlexc03.hq.netapp.com with Microsoft SMTPSVC(6.0.3790.0); 
	Thu, 26 May 2005 11:54:25 -0700
Received: from exnane01.hq.netapp.com ([10.97.0.61]) by burgundy.hq.netapp.com
	with Microsoft SMTPSVC(5.0.2195.6713); 
	Thu, 26 May 2005 11:54:25 -0700
X-MimeOLE: Produced By Microsoft Exchange V6.5.7226.0
Content-class: urn:content-classes:message
MIME-Version: 1.0
Content-Type: text/plain;
	charset="us-ascii"
Content-Transfer-Encoding: quoted-printable
Date: Thu, 26 May 2005 14:54:24 -0400
Message-ID: <C98692FD98048C41885E0B0FACD9DFB8BBBE80@exnane01.hq.netapp.com>
Thread-Topic: BAD_SEQID vs OPEN_CONFIRM discussion
Thread-Index: AcViJFWThOJpzlkET/eaARBX3B4tLA==
From: "Noveck, Dave" <Dave.Noveck@netapp.com>
To: <nfsv4@ietf.org>
X-OriginalArrivalTime: 26 May 2005 18:54:25.0502 (UTC)
	FILETIME=[563B9FE0:01C56224]
X-Spam-Score: 0.0 (/)
X-Scan-Signature: f60d0f7806b0c40781eee6b9cd0b2135
Content-Transfer-Encoding: quoted-printable
Subject: [nfsv4] BAD_SEQID vs OPEN_CONFIRM discussion
X-BeenThere: nfsv4@ietf.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: NFSv4 Working Group <nfsv4.ietf.org>
List-Unsubscribe: <https://www1.ietf.org/mailman/listinfo/nfsv4>,
	<mailto:nfsv4-request@ietf.org?subject=unsubscribe>
List-Archive: <http://www1.ietf.org/pipermail/nfsv4>
List-Post: <mailto:nfsv4@ietf.org>
List-Help: <mailto:nfsv4-request@ietf.org?subject=help>
List-Subscribe: <https://www1.ietf.org/mailman/listinfo/nfsv4>,
	<mailto:nfsv4-request@ietf.org?subject=subscribe>
Sender: nfsv4-bounces@ietf.org
Errors-To: nfsv4-bounces@ietf.org

This is *not* an attempt to re-open the discussion of this issue.  I'm
attempting to fairly summarize the discussion in a small number of
paragraphs.  (Spencer asked me to do this.)  My intention is to be fair
to all participants; let me know if you think I haven't been.  On
the other hand, the whole point of this summary is to compress the
discussion into something people will find easy to read, so please be
understanding if I haven't done full justice to all your arguments.
It just isn't possible without making this much too big.

Rick Macklem raised the question of the best handling if a server
receives an OPEN request with a sequence id which does not match
the one which the server expects.  Clearly the server may return
BAD_SEQID, but is it correct/desirable for the server, in the case in
which there was no current open state for that owner, to instead
accept the OPEN, respond indicating that confirmation is necessary,
and update the current seqid for that owner upon receiving that
confirmation?

The question at the heart of this discussion is whether there are
situations in which a client may legitimately find itself unable to
determine the correct next sequence number for a given owner, making
some help by the server desirable, specifically in the form of
acceptance of a non-matching seqid (subject to confirmation).  Although
there was general agreement that because a server is free to drop an
owner when there is no open state, asking for confirmation (as opposed
to returning BAD_SEQID) was within the spec, beyond that point there was
a sharp division of opinion.

David Robinson argued that accepting the OPEN and asking for
confirmation was the desirable course and in fact stated that if this
had been brought up before the spec was finalized he would have pushed
to make this alternate behavior a "MUST".  His argument was, in part,
that simple clients might forget particular open_owners, and then later
create new owners with the same opaque string and a new seqid sequence,
presumably one beginning at zero.  Given this context, he argued that a
server which accepted the new seqid for that owner was being friendlier
to the simple client, and avoiding the extra round trips that would be
necessary for the client to find an owner string that the server was not
still tracking.

Others argued strongly to the contrary.  There were two basic points of
disagreement with David R's position.  First, it was pointed out that,
while the spec allowed the server to forget about an open_owner at any
time there was no open state, in the case under discussion, that state
had not in fact been forgotten.  It was argued that it is dubious for
the server to act as if it had, despite the fact that an external
observer, constrained to look at the data exchanged on the wire, would
have no solid basis to impeach the server's choice.  The other issue
concerns the validity of the proposed simple client's approach to
management of open_owners.  This group relied on section 8.1.5 which
requires seqids for a given owner to be ascending and argued that
a client which did not maintain enough state to assure that was
simply broken, arguing that this constraint could be accommodated
relatively simply.  Although the specific details are too voluminous
to be cited here, it was also argued that a client which reset the
current seqid in this manner would break the at-most-once semantics
that the sequencing logic was intended to secure.
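
For readers who want the two positions side by side, the server's
choice on an OPEN can be sketched as below.  This is an illustration
only: the function name, its parameters, and the "lenient" switch are
assumptions made for this sketch, the three outcomes are modelled as a
simple enumeration rather than as the actual status and result flags,
and the handling of retransmissions of the last request is left out.

```python
from enum import Enum


class Reply(Enum):
    OK = "OPEN succeeds, no confirmation needed"
    CONFIRM = "OPEN4_RESULT_CONFIRM"
    BAD_SEQID = "BAD_SEQID"


def open_reply(expected_seqid, has_open_state, seqid, lenient):
    """Decide the reply to an OPEN for one owner.

    expected_seqid is None when the server holds no record of the owner.
    lenient selects the behavior David Robinson preferred.
    """
    if expected_seqid is None:
        return Reply.CONFIRM        # owner unknown: confirmation required
    if seqid == expected_seqid:
        return Reply.OK             # normal case, sequence continues
    if has_open_state or not lenient:
        return Reply.BAD_SEQID      # server knows the seqid is wrong
    return Reply.CONFIRM            # disputed case: act as if forgotten
```

Only the last line is in dispute: both sides agreed that either reply
is within the spec when there is no open state, and disagreed on
whether a server that still remembers the owner ought to take that
branch.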


_______________________________________________
nfsv4 mailing list
nfsv4@ietf.org
https://www1.ietf.org/mailman/listinfo/nfsv4


From nfsv4-bounces@ietf.org Tue May 31 18:40:22 2005
Received: from localhost.localdomain ([127.0.0.1] helo=megatron.ietf.org)
	by megatron.ietf.org with esmtp (Exim 4.32)
	id 1DdFPK-0007ve-Jk; Tue, 31 May 2005 18:40:22 -0400
Received: from odin.ietf.org ([132.151.1.176] helo=ietf.org)
	by megatron.ietf.org with esmtp (Exim 4.32) id 1DdFPH-0007vO-D8
	for nfsv4@megatron.ietf.org; Tue, 31 May 2005 18:40:20 -0400
Received: from ietf-mx.ietf.org (ietf-mx.ietf.org [132.151.6.1])
	by ietf.org (8.9.1a/8.9.1a) with ESMTP id SAA06844
	for <nfsv4@ietf.org>; Tue, 31 May 2005 18:40:14 -0400 (EDT)
Received: from brmea-mail-3.sun.com ([192.18.98.34])
	by ietf-mx.ietf.org with esmtp (Exim 4.33) id 1DdFiv-0002Pc-Vz
	for nfsv4@ietf.org; Tue, 31 May 2005 19:00:39 -0400
Received: from centralmail1brm.Central.Sun.COM ([129.147.62.1])
	by brmea-mail-3.sun.com (8.12.10/8.12.9) with ESMTP id j4VMeEjO024993
	for <nfsv4@ietf.org>; Tue, 31 May 2005 16:40:14 -0600 (MDT)
Received: from leviathan.Central.Sun.COM (leviathan.Central.Sun.COM
	[129.153.128.98])
	by centralmail1brm.Central.Sun.COM (8.12.10+Sun/8.12.10/ENSMAIL,
	v2.2) with ESMTP id j4VMeA6v003091
	for <nfsv4@ietf.org>; Tue, 31 May 2005 16:40:11 -0600 (MDT)
Received: from leviathan.Central.Sun.COM (localhost [127.0.0.1])
	by leviathan.Central.Sun.COM (8.13.3+Sun/8.13.3) with ESMTP id
	j4VMa0uC015480; Tue, 31 May 2005 17:36:00 -0500 (CDT)
Received: (from rmesta@localhost)
	by leviathan.Central.Sun.COM (8.13.3+Sun/8.13.3/Submit) id
	j4VMZuaj015479; Tue, 31 May 2005 17:35:56 -0500 (CDT)
Date: Tue, 31 May 2005 17:35:56 -0500
From: Rick Mesta <Ricardo.Mesta@Sun.COM>
To: Andreas Gustafsson <gson@araneus.fi>, Rob Austein <sra@isc.org>
Message-ID: <20050531223556.GA15396@leviathan.sun.com>
References: <20050509150200.28002bb0.olaf@ripe.net>
	<20050518163239.GA10343@minas-tirith.central.sun.com>
	<200505271120.j4RBKvun019031@guava.araneus.fi>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <200505271120.j4RBKvun019031@guava.araneus.fi>
User-Agent: Mutt/1.4.2.1i
X-Spam-Score: 0.0 (/)
X-Scan-Signature: bdc523f9a54890b8a30dd6fd53d5d024
Cc: namedroppers@ops.ietf.org, Olaf Kolkman <okolkman@ripe.net>,
	Rick Mesta <Ricardo.Mesta@Sun.COM>,
	Spencer Shepler <Spencer.Shepler@Sun.COM>,
	Brian Pawlowski <beepy@netapp.com>, nfsv4@ietf.org,
	Olafur Gudmundsson <ogud@ogud.com>
Subject: [nfsv4] Re: draft-mesta-nfs4id-dns-rr-01
X-BeenThere: nfsv4@ietf.org
X-Mailman-Version: 2.1.5
Precedence: list
Reply-To: Rick Mesta <Ricardo.Mesta@Sun.COM>
List-Id: NFSv4 Working Group <nfsv4.ietf.org>
List-Unsubscribe: <https://www1.ietf.org/mailman/listinfo/nfsv4>,
	<mailto:nfsv4-request@ietf.org?subject=unsubscribe>
List-Archive: <http://www1.ietf.org/pipermail/nfsv4>
List-Post: <mailto:nfsv4@ietf.org>
List-Help: <mailto:nfsv4-request@ietf.org?subject=help>
List-Subscribe: <https://www1.ietf.org/mailman/listinfo/nfsv4>,
	<mailto:nfsv4-request@ietf.org?subject=subscribe>
Sender: nfsv4-bounces@ietf.org
Errors-To: nfsv4-bounces@ietf.org


	Hi Rob / Andreas,

	 I believe we're slowly diverging from the original intent
	of this draft.

	 In the I-D, I incorrectly emphasized the aspect of "default
	domain" and how it is derived when using it as a motivating
	example. It is primarily an implementation suggestion to use
	the DNS domain as the NFSv4 domain in the absence of other
	configuration information. I would have been more correct to
	talk about the use of the "current DNS domain".

 	 What we're really trying to accomplish by the introduction
	of the NFS4ID RR is to provide a way to determine the NFSv4
	domain name for a given DNS domain.

	 NFSv4 clients and servers may interact with multiple other
	clients and servers in many different DNS domains. While it
	is suggested that the NFSv4 domain be exactly the DNS domain,
	it is not required. Therefore, for a system to use the
	correct domain for users and groups, it needs to know the
	NFSv4 domain that corresponds to the given DNS domain.
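
	 As a rough sketch of that lookup (not text from the I-D):
	the function below maps a DNS domain to its NFSv4 domain and
	falls back to the DNS domain itself when nothing is published.
	The names nfsv4_domain_for, qualify and lookup_nfs4id are
	hypothetical, the "eng" and "sales" subdomains are made up,
	and the resolver query is replaced by a plain dictionary so
	that no particular DNS library API is assumed.

```python
from typing import Callable, Optional

# Hypothetical callable standing in for a resolver query of the proposed
# NFS4ID RR: it returns the published NFSv4 domain, or None if absent.
Lookup = Callable[[str], Optional[str]]


def nfsv4_domain_for(dns_domain: str, lookup_nfs4id: Lookup) -> str:
    """Return the NFSv4 domain to use for the given DNS domain."""
    name = dns_domain.rstrip(".")          # tolerate a trailing root dot
    published = lookup_nfs4id(name)
    if published:
        return published.rstrip(".")
    return name                            # suggested default: the DNS domain


def qualify(user: str, dns_domain: str, lookup_nfs4id: Lookup) -> str:
    """Build a user@domain string for a user in the given DNS domain."""
    return f"{user}@{nfsv4_domain_for(dns_domain, lookup_nfs4id)}"


# Made-up zone data: two subdomains share one flat "foo.com" user space.
records = {"eng.foo.com": "foo.com", "sales.foo.com": "foo.com"}

print(qualify("rick", "eng.foo.com", records.get))
print(qualify("rick", "bar.org", records.get))
```

	 Because the mapping is keyed by DNS domain rather than by
	host, the same call works for any remote domain a node
	happens to encounter.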

	DHCP is primarily for providing per-host configuration
	parameters. Not unlike an MX record, the NFSv4 domain is
	a per-DNS-domain parameter.

	Using a DNS record allows:

	1) A DNS-domain-wide configuration of an NFSv4 domain name
	   that is different from the DNS domain. For example, a
	   domain hierarchy that has a flat user space may want a
	   common "foo.com" NFSv4 domain similar to how organizations
	   place all e-mail accounts in the highest common domain.

	2) An NFSv4 node may be interacting with multiple other
	   nodes in different domains and need to know each one's
	   NFSv4 domain name. The interaction may be dynamic, with
	   no a priori knowledge of which domains the node will
	   access. Thus it is not reasonable to configure DHCP to
	   provide all possible domains.

	3) NFSv4 is not necessarily a single per-host service. There
	   are user-level implementations (e.g. browsers) that may
	   not want the common per-host DHCP parameters.

	The NFSv4 domain name is a per-domain network parameter that
	also needs to be visible between domains. Therefore, a
	per-host protocol such as DHCP is not appropriate.

		rick


On Fri, May 27, 2005 at 02:20:57PM +0300, Andreas Gustafsson wrote:
| I have some comments on draft-mesta-nfs4id-dns-rr-01.
| 
| As others have pointed out, the DNS protocol and the DNS standards
| have no concept of a "default domain".  Some stub resolver
| implementations do, but even there it tends to be deprecated in favor
| of a "domain search list".
| 
| It seems to me that there is no need to create, or any advantage to
| creating, a connection or dependency between NFSv4 and the DNS other
| than simply saying that the thing following the @ has the syntactic
| form of a domain name, and _by convention_ it is equal to some DNS
| domain belonging to the organization at hand.
| 
| Rather than tying the NFSv4 ID to the ill-defined concept of a
| "default domain", it should simply be configured separately, using the
| same mechanisms used to configure other network parameters of the
| host.  Typically this would be DHCP, or if DHCP cannot be used, a
| configuration file.  Instead of defining a NFS4ID RR type, it would
| therefore make more sense to define a NFS4ID DHCP option.
| 
| Regards,
| -- 
| Andreas Gustafsson, gson@araneus.fi

-- 

_______________________________________________
nfsv4 mailing list
nfsv4@ietf.org
https://www1.ietf.org/mailman/listinfo/nfsv4