Merge tag 'nfs-for-3.4-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfs

author Linus Torvalds <torvalds@linux-foundation.org>

Fri, 23 Mar 2012 15:53:47 +0000 (08:53 -0700)

committer Linus Torvalds <torvalds@linux-foundation.org>

Fri, 23 Mar 2012 15:53:47 +0000 (08:53 -0700)
diff --git a/Documentation/filesystems/nfs/idmapper.txt b/Documentation/filesystems/nfs/idmapper.txt
index 120fd3cf7fd92b666cfcf7ea8eda236e01e22282..fe03d10bb79a36055401b8b3475f57bf57ea38c3 100644
--- a/Documentation/filesystems/nfs/idmapper.txt
+++ b/Documentation/filesystems/nfs/idmapper.txt
@@ -4,13 +4,21 @@ ID Mapper
  =========
  Id mapper is used by NFS to translate user and group ids into names, and to
  translate user and group names into ids.  Part of this translation involves
-performing an upcall to userspace to request the information.  Id mapper will
-user request-key to perform this upcall and cache the result.  The program
-/usr/sbin/nfs.idmap should be called by request-key, and will perform the
-translation and initialize a key with the resulting information.
+performing an upcall to userspace to request the information.  There are two
+ways NFS could obtain this information: placing a call to /sbin/request-key
+or by placing a call to the rpc.idmap daemon.
+
+NFS will attempt to call /sbin/request-key first.  If this succeeds, the
+result will be cached using the generic request-key cache.  This call should
+only fail if /etc/request-key.conf is not configured for the id_resolver key
+type, see the "Configuring" section below if you wish to use the request-key
+method.
+
+If the call to /sbin/request-key fails (if /etc/request-key.conf is not
+configured with the id_resolver key type), then the idmapper will ask the
+legacy rpc.idmap daemon for the id mapping.  This result will be stored
+in a custom NFS idmap cache.
  
- NFS_USE_NEW_IDMAPPER must be selected when configuring the kernel to use this
- feature.
  
  ===========
  Configuring
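
Note on the idmapper.txt hunk: the lookup order it documents (try
/sbin/request-key first, fall back to the legacy rpc.idmap daemon only if that
fails) can be modelled in user space roughly as below. This is a sketch under
assumptions, not the kernel implementation: idmap_lookup(), the backend
callback type and the three fake backends are hypothetical names made up for
illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical backend signature: returns 0 and fills *id on success,
 * a negative value on failure. */
typedef int (*idmap_backend_fn)(const char *name, unsigned int *id);

/* Which path answered, i.e. which cache the result would land in
 * according to the documentation text. */
enum idmap_source { IDMAP_NONE, IDMAP_REQUEST_KEY, IDMAP_LEGACY_DAEMON };

/* Try the request-key style backend first; consult the legacy backend
 * only when the first one fails. */
static enum idmap_source idmap_lookup(idmap_backend_fn request_key,
                                      idmap_backend_fn legacy,
                                      const char *name, unsigned int *id)
{
	assert(name != NULL && id != NULL);

	if (request_key && request_key(name, id) == 0)
		return IDMAP_REQUEST_KEY;
	if (legacy && legacy(name, id) == 0)
		return IDMAP_LEGACY_DAEMON;
	return IDMAP_NONE;
}

/* Fake backends, for illustration only. */
static int fake_request_key_unconfigured(const char *name, unsigned int *id)
{
	(void)name;
	(void)id;
	return -1;	/* models a missing id_resolver entry in request-key.conf */
}

static int fake_request_key_ok(const char *name, unsigned int *id)
{
	if (strcmp(name, "alice@example.org") != 0)
		return -1;
	*id = 1000;
	return 0;
}

static int fake_legacy_daemon(const char *name, unsigned int *id)
{
	if (strcmp(name, "alice@example.org") != 0)
		return -1;
	*id = 1000;
	return 0;
}
```

With fake_request_key_unconfigured as the first backend, the same name still
resolves, but through the legacy path.
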
diff --git a/Documentation/filesystems/nfs/pnfs.txt b/Documentation/filesystems/nfs/pnfs.txt
index 983e14abe7e9d282a9ae017ac34f8bd51749c3a4..c7919c6e3beabf9390714175cbc22d047b05025f 100644
--- a/Documentation/filesystems/nfs/pnfs.txt
+++ b/Documentation/filesystems/nfs/pnfs.txt
@@ -53,3 +53,57 @@ lseg maintains an extra reference corresponding to the NFS_LSEG_VALID
  bit which holds it in the pnfs_layout_hdr's list.  When the final lseg
  is removed from the pnfs_layout_hdr's list, the NFS_LAYOUT_DESTROYED
  bit is set, preventing any new lsegs from being added.
+
+layout drivers
+--------------
+
+PNFS utilizes what is called layout drivers. The STD defines 3 basic
+layout types: "files" "objects" and "blocks". For each of these types
+there is a layout-driver with a common function-vectors table which
+are called by the nfs-client pnfs-core to implement the different layout
+types.
+
+Files-layout-driver code is in: fs/nfs/nfs4filelayout.c && nfs4filelayoutdev.c
+Objects-layout-deriver code is in: fs/nfs/objlayout/.. directory
+Blocks-layout-deriver code is in: fs/nfs/blocklayout/.. directory
+
+objects-layout setup
+--------------------
+
+As part of the full STD implementation the objlayoutdriver.ko needs, at times,
+to automatically login to yet undiscovered iscsi/osd devices. For this the
+driver makes up-calles to a user-mode script called *osd_login*
+
+The path_name of the script to use is by default:
+       /sbin/osd_login.
+This name can be overridden by the Kernel module parameter:
+       objlayoutdriver.osd_login_prog
+
+If Kernel does not find the osd_login_prog path it will zero it out
+and will not attempt farther logins. An admin can then write new value
+to the objlayoutdriver.osd_login_prog Kernel parameter to re-enable it.
+
+The /sbin/osd_login is part of the nfs-utils package, and should usually
+be installed on distributions that support this Kernel version.
+
+The API to the login script is as follows:
+       Usage: $0 -u <URI> -o <OSDNAME> -s <SYSTEMID>
+       Options:
+               -u              target uri e.g. iscsi://<ip>:<port>
+                               (allways exists)
+                               (More protocols can be defined in the future.
+                                The client does not interpret this string it is
+                                passed unchanged as recieved from the Server)
+               -o              osdname of the requested target OSD
+                               (Might be empty)
+                               (A string which denotes the OSD name, there is a
+                                limit of 64 chars on this string)
+               -s              systemid of the requested target OSD
+                               (Might be empty)
+                               (This string, if not empty is always an hex
+                                representation of the 20 bytes osd_system_id)
+
+blocks-layout setup
+-------------------
+
+TODO: Document the setup needs of the blocks layout driver
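
Note on the osd_login usage text in the pnfs.txt hunk: the documented command
line (-u URI, -o OSDNAME, -s SYSTEMID, with the system id given as the hex
form of 20 bytes) could be assembled along these lines. This is a user-space
sketch only. The helper names osd_systemid_to_hex() and osd_login_build_argv()
are hypothetical, and treating the 64-character osdname limit as excluding
the terminating NUL is an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

#define OSD_SYSTEMID_LEN 20	/* bytes, per the -s option description */
#define OSD_NAME_MAX     64	/* assumed to exclude the trailing NUL */

/* Hex-encode a 20-byte system id; out must hold 41 bytes. */
static void osd_systemid_to_hex(const unsigned char id[OSD_SYSTEMID_LEN],
				char out[2 * OSD_SYSTEMID_LEN + 1])
{
	size_t i;

	for (i = 0; i < OSD_SYSTEMID_LEN; i++)
		snprintf(out + 2 * i, 3, "%02x", (unsigned int)id[i]);
}

/* Fill argv[] in the documented order:
 *   prog -u URI -o OSDNAME -s SYSTEMID
 * osdname and systemid_hex may be empty strings, as the usage text allows.
 * Returns the argument count, or -1 if osdname exceeds the limit. */
static int osd_login_build_argv(const char *prog, const char *uri,
				const char *osdname, const char *systemid_hex,
				const char *argv[8])
{
	assert(prog && uri && osdname && systemid_hex);

	if (strlen(osdname) > OSD_NAME_MAX)
		return -1;

	argv[0] = prog;
	argv[1] = "-u";
	argv[2] = uri;
	argv[3] = "-o";
	argv[4] = osdname;
	argv[5] = "-s";
	argv[6] = systemid_hex;
	argv[7] = NULL;
	return 7;
}
```
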
diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt
index 247dcfd62034612e09dce56d6d6fffff4f0c957a..7c33ef8a1ba952ade64bd3209f0669a80b2ab301 100644
--- a/Documentation/kernel-parameters.txt
+++ b/Documentation/kernel-parameters.txt
@@ -1672,6 +1672,14 @@ bytes respectively. Such letter suffixes can also be entirely omitted.
                         of returning the full 64-bit number.
                         The default is to return 64-bit inode numbers.
  
+       nfs.max_session_slots=
+                       [NFSv4.1] Sets the maximum number of session slots
+                       the client will attempt to negotiate with the server.
+                       This limits the number of simultaneous RPC requests
+                       that the client can send to the NFSv4.1 server.
+                       Note that there is little point in setting this
+                       value higher than the max_tcp_slot_table_limit.
+
         nfs.nfs4_disable_idmapping=
                         [NFSv4] When set to the default of '1', this option
                         ensures that both the RPC level authentication
@@ -1685,6 +1693,21 @@ bytes respectively. Such letter suffixes can also be entirely omitted.
                         back to using the idmapper.
                         To turn off this behaviour, set the value to '0'.
  
+       nfs.send_implementation_id =
+                       [NFSv4.1] Send client implementation identification
+                       information in exchange_id requests.
+                       If zero, no implementation identification information
+                       will be sent.
+                       The default is to send the implementation identification
+                       information.
+
+
+       objlayoutdriver.osd_login_prog=
+                       [NFS] [OBJLAYOUT] sets the pathname to the program which
+                       is used to automatically discover and login into new
+                       osd-targets. Please see:
+                       Documentation/filesystems/pnfs.txt for more explanations
+
         nmi_debug=      [KNL,AVR32,SH] Specify one or more actions to take
                         when a NMI is triggered.
                         Format: [state][,regs][,debounce][,die]
diff --git a/fs/lockd/clnt4xdr.c b/fs/lockd/clnt4xdr.c
index f848b52c67b19e565567168a2bd5810fcc5f0a6c..3ddcbb1c0a432728f626986b1d057d11aafdce84 100644
--- a/fs/lockd/clnt4xdr.c
+++ b/fs/lockd/clnt4xdr.c
@@ -598,7 +598,7 @@ static struct rpc_procinfo  nlm4_procedures[] = {
         PROC(GRANTED_RES,       res,            norep),
  };
  
-struct rpc_version     nlm_version4 = {
+const struct rpc_version nlm_version4 = {
         .number         = 4,
         .nrprocs        = ARRAY_SIZE(nlm4_procedures),
         .procs          = nlm4_procedures,
diff --git a/fs/lockd/clntlock.c b/fs/lockd/clntlock.c
index 8d4ea8351e3d4e093263104764d0f959aa35c05b..ba1dc2eebd1ef8413d0593abfde9e14229169ab3 100644
--- a/fs/lockd/clntlock.c
+++ b/fs/lockd/clntlock.c
@@ -62,7 +62,8 @@ struct nlm_host *nlmclnt_init(const struct nlmclnt_initdata *nlm_init)
  
         host = nlmclnt_lookup_host(nlm_init->address, nlm_init->addrlen,
                                    nlm_init->protocol, nlm_version,
-                                  nlm_init->hostname, nlm_init->noresvport);
+                                  nlm_init->hostname, nlm_init->noresvport,
+                                  nlm_init->net);
         if (host == NULL) {
                 lockd_down();
                 return ERR_PTR(-ENOLCK);
diff --git a/fs/lockd/clntxdr.c b/fs/lockd/clntxdr.c
index 180ac34feb9a8630e3bbeff633b06d858a1420f8..3d35e3e80c1ccfac1367647b6ac417ba3f5bd1b2 100644
--- a/fs/lockd/clntxdr.c
+++ b/fs/lockd/clntxdr.c
@@ -596,19 +596,19 @@ static struct rpc_procinfo        nlm_procedures[] = {
         PROC(GRANTED_RES,       res,            norep),
  };
  
-static struct rpc_version      nlm_version1 = {
+static const struct rpc_version        nlm_version1 = {
                 .number         = 1,
                 .nrprocs        = ARRAY_SIZE(nlm_procedures),
                 .procs          = nlm_procedures,
  };
  
-static struct rpc_version      nlm_version3 = {
+static const struct rpc_version        nlm_version3 = {
                 .number         = 3,
                 .nrprocs        = ARRAY_SIZE(nlm_procedures),
                 .procs          = nlm_procedures,
  };
  
-static struct rpc_version      *nlm_versions[] = {
+static const struct rpc_version        *nlm_versions[] = {
         [1] = &nlm_version1,
         [3] = &nlm_version3,
  #ifdef CONFIG_LOCKD_V4
@@ -618,7 +618,7 @@ static struct rpc_version   *nlm_versions[] = {
  
  static struct rpc_stat         nlm_rpc_stats;
  
-struct rpc_program             nlm_program = {
+const struct rpc_program       nlm_program = {
                 .name           = "lockd",
                 .number         = NLM_PROGRAM,
                 .nrvers         = ARRAY_SIZE(nlm_versions),
diff --git a/fs/lockd/host.c b/fs/lockd/host.c
index 6f29836ec0cbd81913cfda0f41852e3a4aa3fb4b..eb75ca7c2d6edd4782ad9f025b6115c78b3c9307 100644
--- a/fs/lockd/host.c
+++ b/fs/lockd/host.c
@@ -17,6 +17,8 @@
  #include <linux/lockd/lockd.h>
  #include <linux/mutex.h>
  
+#include <linux/sunrpc/svc_xprt.h>
+
  #include <net/ipv6.h>
  
  #define NLMDBG_FACILITY                NLMDBG_HOSTCACHE
@@ -54,6 +56,7 @@ struct nlm_lookup_host_info {
         const char              *hostname;      /* remote's hostname */
         const size_t            hostname_len;   /* it's length */
         const int               noresvport;     /* use non-priv port */
+       struct net              *net;           /* network namespace to bind */
  };
  
  /*
@@ -155,6 +158,7 @@ static struct nlm_host *nlm_alloc_host(struct nlm_lookup_host_info *ni,
         INIT_LIST_HEAD(&host->h_reclaim);
         host->h_nsmhandle  = nsm;
         host->h_addrbuf    = nsm->sm_addrbuf;
+       host->net          = ni->net;
  
  out:
         return host;
@@ -206,7 +210,8 @@ struct nlm_host *nlmclnt_lookup_host(const struct sockaddr *sap,
                                      const unsigned short protocol,
                                      const u32 version,
                                      const char *hostname,
-                                    int noresvport)
+                                    int noresvport,
+                                    struct net *net)
  {
         struct nlm_lookup_host_info ni = {
                 .server         = 0,
@@ -217,6 +222,7 @@ struct nlm_host *nlmclnt_lookup_host(const struct sockaddr *sap,
                 .hostname       = hostname,
                 .hostname_len   = strlen(hostname),
                 .noresvport     = noresvport,
+               .net            = net,
         };
         struct hlist_head *chain;
         struct hlist_node *pos;
@@ -231,6 +237,8 @@ struct nlm_host *nlmclnt_lookup_host(const struct sockaddr *sap,
  
         chain = &nlm_client_hosts[nlm_hash_address(sap)];
         hlist_for_each_entry(host, pos, chain, h_hash) {
+               if (host->net != net)
+                       continue;
                 if (!rpc_cmp_addr(nlm_addr(host), sap))
                         continue;
  
@@ -318,6 +326,7 @@ struct nlm_host *nlmsvc_lookup_host(const struct svc_rqst *rqstp,
         struct nsm_handle *nsm = NULL;
         struct sockaddr *src_sap = svc_daddr(rqstp);
         size_t src_len = rqstp->rq_daddrlen;
+       struct net *net = rqstp->rq_xprt->xpt_net;
         struct nlm_lookup_host_info ni = {
                 .server         = 1,
                 .sap            = svc_addr(rqstp),
@@ -326,6 +335,7 @@ struct nlm_host *nlmsvc_lookup_host(const struct svc_rqst *rqstp,
                 .version        = rqstp->rq_vers,
                 .hostname       = hostname,
                 .hostname_len   = hostname_len,
+               .net            = net,
         };
  
         dprintk("lockd: %s(host='%*s', vers=%u, proto=%s)\n", __func__,
@@ -339,6 +349,8 @@ struct nlm_host *nlmsvc_lookup_host(const struct svc_rqst *rqstp,
  
         chain = &nlm_server_hosts[nlm_hash_address(ni.sap)];
         hlist_for_each_entry(host, pos, chain, h_hash) {
+               if (host->net != net)
+                       continue;
                 if (!rpc_cmp_addr(nlm_addr(host), ni.sap))
                         continue;
  
@@ -431,7 +443,7 @@ nlm_bind_host(struct nlm_host *host)
                         .to_retries     = 5U,
                 };
                 struct rpc_create_args args = {
-                       .net            = &init_net,
+                       .net            = host->net,
                         .protocol       = host->h_proto,
                         .address        = nlm_addr(host),
                         .addrsize       = host->h_addrlen,
@@ -553,12 +565,8 @@ void nlm_host_rebooted(const struct nlm_reboot *info)
         nsm_release(nsm);
  }
  
-/*
- * Shut down the hosts module.
- * Note that this routine is called only at server shutdown time.
- */
  void
-nlm_shutdown_hosts(void)
+nlm_shutdown_hosts_net(struct net *net)
  {
         struct hlist_head *chain;
         struct hlist_node *pos;
@@ -570,6 +578,8 @@ nlm_shutdown_hosts(void)
         /* First, make all hosts eligible for gc */
         dprintk("lockd: nuking all hosts...\n");
         for_each_host(host, pos, chain, nlm_server_hosts) {
+               if (net && host->net != net)
+                       continue;
                 host->h_expires = jiffies - 1;
                 if (host->h_rpcclnt) {
                         rpc_shutdown_client(host->h_rpcclnt);
@@ -580,15 +590,29 @@ nlm_shutdown_hosts(void)
         /* Then, perform a garbage collection pass */
         nlm_gc_hosts();
         mutex_unlock(&nlm_host_mutex);
+}
+
+/*
+ * Shut down the hosts module.
+ * Note that this routine is called only at server shutdown time.
+ */
+void
+nlm_shutdown_hosts(void)
+{
+       struct hlist_head *chain;
+       struct hlist_node *pos;
+       struct nlm_host *host;
+
+       nlm_shutdown_hosts_net(NULL);
  
         /* complain if any hosts are left */
         if (nrhosts != 0) {
                 printk(KERN_WARNING "lockd: couldn't shutdown host module!\n");
                 dprintk("lockd: %lu hosts left:\n", nrhosts);
                 for_each_host(host, pos, chain, nlm_server_hosts) {
-                       dprintk("       %s (cnt %d use %d exp %ld)\n",
+                       dprintk("       %s (cnt %d use %d exp %ld net %p)\n",
                                 host->h_name, atomic_read(&host->h_count),
-                               host->h_inuse, host->h_expires);
+                               host->h_inuse, host->h_expires, host->net);
                 }
         }
  }
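
Note on the host.c changes: both lookup paths now skip cached hosts that
belong to a different network namespace, and nlm_shutdown_hosts_net() treats a
NULL namespace as "match every host". The following user-space model shows the
two filters side by side. It is a sketch under assumptions: net_model,
host_model, host_lookup() and hosts_expire() are hypothetical stand-ins, and a
plain integer replaces the socket address comparison done by rpc_cmp_addr().

```c
#include <assert.h>
#include <stddef.h>

struct net_model { int id; };		/* stand-in for struct net */

struct host_model {
	struct net_model *net;		/* owning namespace */
	unsigned int addr;		/* stand-in for the peer address */
	int expired;			/* set when marked eligible for gc */
	struct host_model *next;
};

/* Lookup: a hit needs the same namespace and the same address. */
static struct host_model *host_lookup(struct host_model *chain,
				      const struct net_model *net,
				      unsigned int addr)
{
	struct host_model *h;

	for (h = chain; h != NULL; h = h->next) {
		if (h->net != net)
			continue;
		if (h->addr != addr)
			continue;
		return h;
	}
	return NULL;
}

/* Shutdown: a NULL net selects every host, otherwise only hosts of that
 * namespace are marked. Returns the number of hosts marked. */
static int hosts_expire(struct host_model *chain, const struct net_model *net)
{
	struct host_model *h;
	int marked = 0;

	for (h = chain; h != NULL; h = h->next) {
		if (net && h->net != net)
			continue;
		h->expired = 1;
		marked++;
	}
	return marked;
}
```

Two hosts with the same address but different namespaces stay distinct in
this model, which is the point of the added host->net comparison.
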
diff --git a/fs/lockd/mon.c b/fs/lockd/mon.c
index 65ba36b80a9e1a482f838d0846a07d0ec800b015..7ef14b3c5bee9460609d863a973f0dfad93f163c 100644
--- a/fs/lockd/mon.c
+++ b/fs/lockd/mon.c
@@ -47,7 +47,7 @@ struct nsm_res {
         u32                     state;
  };
  
-static struct rpc_program      nsm_program;
+static const struct rpc_program        nsm_program;
  static                         LIST_HEAD(nsm_handles);
  static                         DEFINE_SPINLOCK(nsm_lock);
  
@@ -62,14 +62,14 @@ static inline struct sockaddr *nsm_addr(const struct nsm_handle *nsm)
         return (struct sockaddr *)&nsm->sm_addr;
  }
  
-static struct rpc_clnt *nsm_create(void)
+static struct rpc_clnt *nsm_create(struct net *net)
  {
         struct sockaddr_in sin = {
                 .sin_family             = AF_INET,
                 .sin_addr.s_addr        = htonl(INADDR_LOOPBACK),
         };
         struct rpc_create_args args = {
-               .net                    = &init_net,
+               .net                    = net,
                 .protocol               = XPRT_TRANSPORT_UDP,
                 .address                = (struct sockaddr *)&sin,
                 .addrsize               = sizeof(sin),
@@ -83,7 +83,8 @@ static struct rpc_clnt *nsm_create(void)
         return rpc_create(&args);
  }
  
-static int nsm_mon_unmon(struct nsm_handle *nsm, u32 proc, struct nsm_res *res)
+static int nsm_mon_unmon(struct nsm_handle *nsm, u32 proc, struct nsm_res *res,
+                        struct net *net)
  {
         struct rpc_clnt *clnt;
         int             status;
@@ -99,7 +100,7 @@ static int nsm_mon_unmon(struct nsm_handle *nsm, u32 proc, struct nsm_res *res)
                 .rpc_resp       = res,
         };
  
-       clnt = nsm_create();
+       clnt = nsm_create(net);
         if (IS_ERR(clnt)) {
                 status = PTR_ERR(clnt);
                 dprintk("lockd: failed to create NSM upcall transport, "
@@ -149,7 +150,7 @@ int nsm_monitor(const struct nlm_host *host)
          */
         nsm->sm_mon_name = nsm_use_hostnames ? nsm->sm_name : nsm->sm_addrbuf;
  
-       status = nsm_mon_unmon(nsm, NSMPROC_MON, &res);
+       status = nsm_mon_unmon(nsm, NSMPROC_MON, &res, host->net);
         if (unlikely(res.status != 0))
                 status = -EIO;
         if (unlikely(status < 0)) {
@@ -183,7 +184,7 @@ void nsm_unmonitor(const struct nlm_host *host)
          && nsm->sm_monitored && !nsm->sm_sticky) {
                 dprintk("lockd: nsm_unmonitor(%s)\n", nsm->sm_name);
  
-               status = nsm_mon_unmon(nsm, NSMPROC_UNMON, &res);
+               status = nsm_mon_unmon(nsm, NSMPROC_UNMON, &res, host->net);
                 if (res.status != 0)
                         status = -EIO;
                 if (status < 0)
@@ -534,19 +535,19 @@ static struct rpc_procinfo        nsm_procedures[] = {
         },
  };
  
-static struct rpc_version      nsm_version1 = {
+static const struct rpc_version nsm_version1 = {
                 .number         = 1,
                 .nrprocs        = ARRAY_SIZE(nsm_procedures),
                 .procs          = nsm_procedures
  };
  
-static struct rpc_version *    nsm_version[] = {
+static const struct rpc_version *nsm_version[] = {
         [1] = &nsm_version1,
  };
  
  static struct rpc_stat         nsm_stats;
  
-static struct rpc_program      nsm_program = {
+static const struct rpc_program nsm_program = {
                 .name           = "statd",
                 .number         = NSM_PROGRAM,
                 .nrvers         = ARRAY_SIZE(nsm_version),
diff --git a/fs/lockd/netns.h b/fs/lockd/netns.h
new file mode 100644
index 0000000..ce227e0
--- /dev/null
+++ b/fs/lockd/netns.h
@@ -0,0 +1,12 @@
+#ifndef __LOCKD_NETNS_H__
+#define __LOCKD_NETNS_H__
+
+#include <net/netns/generic.h>
+
+struct lockd_net {
+       unsigned int nlmsvc_users;
+};
+
+extern int lockd_net_id;
+
+#endif
diff --git a/fs/lockd/svc.c b/fs/lockd/svc.c
index c061b9aa7ddb165c4b5241e73d9c8bb943f2f678..2774e1013b34467acc3c1c6bc55f47fcac8d3ca7 100644
--- a/fs/lockd/svc.c
+++ b/fs/lockd/svc.c
@@ -35,6 +35,8 @@
  #include <linux/lockd/lockd.h>
  #include <linux/nfs.h>
  
+#include "netns.h"
+
  #define NLMDBG_FACILITY                NLMDBG_SVC
  #define LOCKD_BUFSIZE          (1024 + NLMSVC_XDRSIZE)
  #define ALLOWED_SIGS           (sigmask(SIGKILL))
@@ -50,6 +52,8 @@ static struct task_struct     *nlmsvc_task;
  static struct svc_rqst         *nlmsvc_rqst;
  unsigned long                  nlmsvc_timeout;
  
+int lockd_net_id;
+
  /*
   * These can be set at insmod time (useful for NFS as root filesystem),
   * and also changed through the sysctl interface.  -- Jamie Lokier, Aug 2003
@@ -189,27 +193,29 @@ lockd(void *vrqstp)
  }
  
  static int create_lockd_listener(struct svc_serv *serv, const char *name,
-                                const int family, const unsigned short port)
+                                struct net *net, const int family,
+                                const unsigned short port)
  {
         struct svc_xprt *xprt;
  
-       xprt = svc_find_xprt(serv, name, family, 0);
+       xprt = svc_find_xprt(serv, name, net, family, 0);
         if (xprt == NULL)
-               return svc_create_xprt(serv, name, &init_net, family, port,
+               return svc_create_xprt(serv, name, net, family, port,
                                                 SVC_SOCK_DEFAULTS);
         svc_xprt_put(xprt);
         return 0;
  }
  
-static int create_lockd_family(struct svc_serv *serv, const int family)
+static int create_lockd_family(struct svc_serv *serv, struct net *net,
+                              const int family)
  {
         int err;
  
-       err = create_lockd_listener(serv, "udp", family, nlm_udpport);
+       err = create_lockd_listener(serv, "udp", net, family, nlm_udpport);
         if (err < 0)
                 return err;
  
-       return create_lockd_listener(serv, "tcp", family, nlm_tcpport);
+       return create_lockd_listener(serv, "tcp", net, family, nlm_tcpport);
  }
  
  /*
@@ -222,16 +228,16 @@ static int create_lockd_family(struct svc_serv *serv, const int family)
   * Returns zero if all listeners are available; otherwise a
   * negative errno value is returned.
   */
-static int make_socks(struct svc_serv *serv)
+static int make_socks(struct svc_serv *serv, struct net *net)
  {
         static int warned;
         int err;
  
-       err = create_lockd_family(serv, PF_INET);
+       err = create_lockd_family(serv, net, PF_INET);
         if (err < 0)
                 goto out_err;
  
-       err = create_lockd_family(serv, PF_INET6);
+       err = create_lockd_family(serv, net, PF_INET6);
         if (err < 0 && err != -EAFNOSUPPORT)
                 goto out_err;
  
@@ -245,6 +251,47 @@ out_err:
         return err;
  }
  
+static int lockd_up_net(struct net *net)
+{
+       struct lockd_net *ln = net_generic(net, lockd_net_id);
+       struct svc_serv *serv = nlmsvc_rqst->rq_server;
+       int error;
+
+       if (ln->nlmsvc_users)
+               return 0;
+
+       error = svc_rpcb_setup(serv, net);
+       if (error)
+               goto err_rpcb;
+
+       error = make_socks(serv, net);
+       if (error < 0)
+               goto err_socks;
+       return 0;
+
+err_socks:
+       svc_rpcb_cleanup(serv, net);
+err_rpcb:
+       return error;
+}
+
+static void lockd_down_net(struct net *net)
+{
+       struct lockd_net *ln = net_generic(net, lockd_net_id);
+       struct svc_serv *serv = nlmsvc_rqst->rq_server;
+
+       if (ln->nlmsvc_users) {
+               if (--ln->nlmsvc_users == 0) {
+                       nlm_shutdown_hosts_net(net);
+                       svc_shutdown_net(serv, net);
+               }
+       } else {
+               printk(KERN_ERR "lockd_down_net: no users! task=%p, net=%p\n",
+                               nlmsvc_task, net);
+               BUG();
+       }
+}
+
  /*
   * Bring up the lockd process if it's not already up.
   */
@@ -252,13 +299,16 @@ int lockd_up(void)
  {
         struct svc_serv *serv;
         int             error = 0;
+       struct net *net = current->nsproxy->net_ns;
  
         mutex_lock(&nlmsvc_mutex);
         /*
          * Check whether we're already up and running.
          */
-       if (nlmsvc_rqst)
+       if (nlmsvc_rqst) {
+               error = lockd_up_net(net);
                 goto out;
+       }
  
         /*
          * Sanity check: if there's no pid,
@@ -275,7 +325,7 @@ int lockd_up(void)
                 goto out;
         }
  
-       error = make_socks(serv);
+       error = make_socks(serv, net);
         if (error < 0)
                 goto destroy_and_out;
  
@@ -313,8 +363,12 @@ int lockd_up(void)
  destroy_and_out:
         svc_destroy(serv);
  out:
-       if (!error)
+       if (!error) {
+               struct lockd_net *ln = net_generic(net, lockd_net_id);
+
+               ln->nlmsvc_users++;
                 nlmsvc_users++;
+       }
         mutex_unlock(&nlmsvc_mutex);
         return error;
  }
@@ -328,8 +382,10 @@ lockd_down(void)
  {
         mutex_lock(&nlmsvc_mutex);
         if (nlmsvc_users) {
-               if (--nlmsvc_users)
+               if (--nlmsvc_users) {
+                       lockd_down_net(current->nsproxy->net_ns);
                         goto out;
+               }
         } else {
                 printk(KERN_ERR "lockd_down: no users! task=%p\n",
                         nlmsvc_task);
@@ -497,24 +553,55 @@ module_param_call(nlm_tcpport, param_set_port, param_get_int,
  module_param(nsm_use_hostnames, bool, 0644);
  module_param(nlm_max_connections, uint, 0644);
  
+static int lockd_init_net(struct net *net)
+{
+       return 0;
+}
+
+static void lockd_exit_net(struct net *net)
+{
+}
+
+static struct pernet_operations lockd_net_ops = {
+       .init = lockd_init_net,
+       .exit = lockd_exit_net,
+       .id = &lockd_net_id,
+       .size = sizeof(struct lockd_net),
+};
+
+
  /*
   * Initialising and terminating the module.
   */
  
  static int __init init_nlm(void)
  {
+       int err;
+
  #ifdef CONFIG_SYSCTL
+       err = -ENOMEM;
         nlm_sysctl_table = register_sysctl_table(nlm_sysctl_root);
-       return nlm_sysctl_table ? 0 : -ENOMEM;
-#else
+       if (nlm_sysctl_table == NULL)
+               goto err_sysctl;
+#endif
+       err = register_pernet_subsys(&lockd_net_ops);
+       if (err)
+               goto err_pernet;
         return 0;
+
+err_pernet:
+#ifdef CONFIG_SYSCTL
+       unregister_sysctl_table(nlm_sysctl_table);
  #endif
+err_sysctl:
+       return err;
  }
  
  static void __exit exit_nlm(void)
  {
         /* FIXME: delete all NLM clients */
         nlm_shutdown_hosts();
+       unregister_pernet_subsys(&lockd_net_ops);
  #ifdef CONFIG_SYSCTL
         unregister_sysctl_table(nlm_sysctl_table);
  #endif
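
Note on the reworked init_nlm() in svc.c: it now registers two things (the
sysctl table and the per-net operations) and unwinds them in reverse order
through goto labels when the second step fails. The pattern can be tried out
in user space as below. This is only a model: init_state, the model_* helpers
and the fail flags used to inject errors are hypothetical and do not exist in
the kernel.

```c
#include <assert.h>
#include <errno.h>

struct init_state {
	int sysctl_registered;
	int pernet_registered;
};

/* Hypothetical stand-ins for the two registration steps; a non-zero
 * fail argument injects an error. */
static int model_register_sysctl(struct init_state *s, int fail)
{
	if (fail)
		return -ENOMEM;
	s->sysctl_registered = 1;
	return 0;
}

static int model_register_pernet(struct init_state *s, int fail)
{
	if (fail)
		return -ENOMEM;
	s->pernet_registered = 1;
	return 0;
}

static void model_unregister_sysctl(struct init_state *s)
{
	s->sysctl_registered = 0;
}

/* Register in order; on failure undo only what already succeeded. */
static int model_init(struct init_state *s, int fail_sysctl, int fail_pernet)
{
	int err;

	err = model_register_sysctl(s, fail_sysctl);
	if (err)
		goto err_sysctl;
	err = model_register_pernet(s, fail_pernet);
	if (err)
		goto err_pernet;
	return 0;

err_pernet:
	model_unregister_sysctl(s);
err_sysctl:
	return err;
}
```
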
diff --git a/fs/lockd/svclock.c b/fs/lockd/svclock.c
index f0179c3745d27936f850c46d25ee7f6418f13ea7..e46353f41a4202ec2138998ab5449b833dc139db 100644
--- a/fs/lockd/svclock.c
+++ b/fs/lockd/svclock.c
@@ -46,7 +46,6 @@ static void   nlmsvc_remove_block(struct nlm_block *block);
  static int nlmsvc_setgrantargs(struct nlm_rqst *call, struct nlm_lock *lock);
  static void nlmsvc_freegrantargs(struct nlm_rqst *call);
  static const struct rpc_call_ops nlmsvc_grant_ops;
-static const char *nlmdbg_cookie2a(const struct nlm_cookie *cookie);
  
  /*
   * The list of blocked locks to retry
@@ -54,6 +53,35 @@ static const char *nlmdbg_cookie2a(const struct nlm_cookie *cookie);
  static LIST_HEAD(nlm_blocked);
  static DEFINE_SPINLOCK(nlm_blocked_lock);
  
+#ifdef LOCKD_DEBUG
+static const char *nlmdbg_cookie2a(const struct nlm_cookie *cookie)
+{
+       /*
+        * We can get away with a static buffer because we're only
+        * called with BKL held.
+        */
+       static char buf[2*NLM_MAXCOOKIELEN+1];
+       unsigned int i, len = sizeof(buf);
+       char *p = buf;
+
+       len--;  /* allow for trailing \0 */
+       if (len < 3)
+               return "???";
+       for (i = 0 ; i < cookie->len ; i++) {
+               if (len < 2) {
+                       strcpy(p-3, "...");
+                       break;
+               }
+               sprintf(p, "%02x", cookie->data[i]);
+               p += 2;
+               len -= 2;
+       }
+       *p = '\0';
+
+       return buf;
+}
+#endif
+
  /*
   * Insert a blocked lock into the global list
   */
@@ -935,32 +963,3 @@ nlmsvc_retry_blocked(void)
  
         return timeout;
  }
-
-#ifdef RPC_DEBUG
-static const char *nlmdbg_cookie2a(const struct nlm_cookie *cookie)
-{
-       /*
-        * We can get away with a static buffer because we're only
-        * called with BKL held.
-        */
-       static char buf[2*NLM_MAXCOOKIELEN+1];
-       unsigned int i, len = sizeof(buf);
-       char *p = buf;
-
-       len--;  /* allow for trailing \0 */
-       if (len < 3)
-               return "???";
-       for (i = 0 ; i < cookie->len ; i++) {
-               if (len < 2) {
-                       strcpy(p-3, "...");
-                       break;
-               }
-               sprintf(p, "%02x", cookie->data[i]);
-               p += 2;
-               len -= 2;
-       }
-       *p = '\0';
-
-       return buf;
-}
-#endif
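
Note on nlmdbg_cookie2a(), which this file's diff moves rather than rewrites:
it renders a cookie as hex and replaces the tail with "..." when the buffer
runs out. A user-space variant of the same idea is sketched below.
cookie_to_hex() is a hypothetical name, and unlike the kernel function it
takes the output buffer from the caller instead of using a static one. The
minimum buffer size of 5 is a requirement of this sketch, chosen so that the
ellipsis never lands before the start of the buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Render data[0..len) as lowercase hex into buf (bufsize bytes, including
 * the trailing NUL). If the bytes do not all fit, the last three characters
 * already written are replaced by "...". Returns buf. */
static const char *cookie_to_hex(const unsigned char *data, size_t len,
				 char *buf, size_t bufsize)
{
	size_t room, i;
	char *p = buf;

	assert(buf != NULL && bufsize >= 5);

	room = bufsize - 1;	/* allow for the trailing NUL */
	for (i = 0; i < len; i++) {
		if (room < 2) {
			memcpy(p - 3, "...", 3);
			break;
		}
		snprintf(p, 3, "%02x", (unsigned int)data[i]);
		p += 2;
		room -= 2;
	}
	*p = '\0';

	return buf;
}
```
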
diff --git a/fs/nfs/Kconfig b/fs/nfs/Kconfig
index dbcd82126aed309026dc55f33754f8209ceaaeb5..2a0e6c599147aac9e66a5969c9c00593aa0dd380 100644
--- a/fs/nfs/Kconfig
+++ b/fs/nfs/Kconfig
@@ -64,6 +64,7 @@ config NFS_V4
         bool "NFS client support for NFS version 4"
         depends on NFS_FS
         select SUNRPC_GSS
+       select KEYS
         help
           This option enables support for version 4 of the NFS protocol
           (RFC 3530) in the kernel's NFS client.
@@ -98,6 +99,18 @@ config PNFS_OBJLAYOUT
         depends on NFS_FS && NFS_V4_1 && SCSI_OSD_ULD
         default m
  
+config NFS_V4_1_IMPLEMENTATION_ID_DOMAIN
+       string "NFSv4.1 Implementation ID Domain"
+       depends on NFS_V4_1
+       default "kernel.org"
+       help
+         This option defines the domain portion of the implementation ID that
+         may be sent in the NFS exchange_id operation.  The value must be in
+         the format of a DNS domain name and should be set to the DNS domain
+         name of the distribution.
+         If the NFS client is unchanged from the upstream kernel, this
+         option should be set to the default "kernel.org".
+
  config ROOT_NFS
         bool "Root file system on NFS"
         depends on NFS_FS=y && IP_PNP
@@ -130,16 +143,10 @@ config NFS_USE_KERNEL_DNS
         bool
         depends on NFS_V4 && !NFS_USE_LEGACY_DNS
         select DNS_RESOLVER
-       select KEYS
         default y
  
-config NFS_USE_NEW_IDMAPPER
-       bool "Use the new idmapper upcall routine"
-       depends on NFS_V4 && KEYS
-       help
-         Say Y here if you want NFS to use the new idmapper upcall functions.
-         You will need /sbin/request-key (usually provided by the keyutils
-         package).  For details, read
-         <file:Documentation/filesystems/nfs/idmapper.txt>.
-
-         If you are unsure, say N.
+config NFS_DEBUG
+       bool
+       depends on NFS_FS && SUNRPC_DEBUG
+       select CRC32
+       default y
diff --git a/fs/nfs/blocklayout/blocklayout.c b/fs/nfs/blocklayout/blocklayout.c
index 48cfac31f64ce2b3679362b91f324ff9afc4e262..9c94297bb70e9502c40825249eecb24093e142d2 100644
--- a/fs/nfs/blocklayout/blocklayout.c
+++ b/fs/nfs/blocklayout/blocklayout.c
@@ -46,9 +46,6 @@ MODULE_LICENSE("GPL");
  MODULE_AUTHOR("Andy Adamson <andros@citi.umich.edu>");
  MODULE_DESCRIPTION("The NFSv4.1 pNFS Block layout driver");
  
-struct dentry *bl_device_pipe;
-wait_queue_head_t bl_wq;
-
  static void print_page(struct page *page)
  {
         dprintk("PRINTPAGE page %p\n", page);
@@ -236,12 +233,11 @@ bl_read_pagelist(struct nfs_read_data *rdata)
         sector_t isect, extent_length = 0;
         struct parallel_io *par;
         loff_t f_offset = rdata->args.offset;
-       size_t count = rdata->args.count;
         struct page **pages = rdata->args.pages;
         int pg_index = rdata->args.pgbase >> PAGE_CACHE_SHIFT;
  
-       dprintk("%s enter nr_pages %u offset %lld count %Zd\n", __func__,
-              rdata->npages, f_offset, count);
+       dprintk("%s enter nr_pages %u offset %lld count %u\n", __func__,
+              rdata->npages, f_offset, (unsigned int)rdata->args.count);
  
         par = alloc_parallel(rdata);
         if (!par)
@@ -1025,10 +1021,128 @@ static const struct rpc_pipe_ops bl_upcall_ops = {
         .destroy_msg    = bl_pipe_destroy_msg,
  };
  
+static struct dentry *nfs4blocklayout_register_sb(struct super_block *sb,
+                                           struct rpc_pipe *pipe)
+{
+       struct dentry *dir, *dentry;
+
+       dir = rpc_d_lookup_sb(sb, NFS_PIPE_DIRNAME);
+       if (dir == NULL)
+               return ERR_PTR(-ENOENT);
+       dentry = rpc_mkpipe_dentry(dir, "blocklayout", NULL, pipe);
+       dput(dir);
+       return dentry;
+}
+
+static void nfs4blocklayout_unregister_sb(struct super_block *sb,
+                                         struct rpc_pipe *pipe)
+{
+       if (pipe->dentry)
+               rpc_unlink(pipe->dentry);
+}
+
+static int rpc_pipefs_event(struct notifier_block *nb, unsigned long event,
+                          void *ptr)
+{
+       struct super_block *sb = ptr;
+       struct net *net = sb->s_fs_info;
+       struct nfs_net *nn = net_generic(net, nfs_net_id);
+       struct dentry *dentry;
+       int ret = 0;
+
+       if (!try_module_get(THIS_MODULE))
+               return 0;
+
+       if (nn->bl_device_pipe == NULL) {
+               module_put(THIS_MODULE);
+               return 0;
+       }
+
+       switch (event) {
+       case RPC_PIPEFS_MOUNT:
+               dentry = nfs4blocklayout_register_sb(sb, nn->bl_device_pipe);
+               if (IS_ERR(dentry)) {
+                       ret = PTR_ERR(dentry);
+                       break;
+               }
+               nn->bl_device_pipe->dentry = dentry;
+               break;
+       case RPC_PIPEFS_UMOUNT:
+               if (nn->bl_device_pipe->dentry)
+                       nfs4blocklayout_unregister_sb(sb, nn->bl_device_pipe);
+               break;
+       default:
+               ret = -ENOTSUPP;
+               break;
+       }
+       module_put(THIS_MODULE);
+       return ret;
+}
+
+static struct notifier_block nfs4blocklayout_block = {
+       .notifier_call = rpc_pipefs_event,
+};
+
+static struct dentry *nfs4blocklayout_register_net(struct net *net,
+                                                  struct rpc_pipe *pipe)
+{
+       struct super_block *pipefs_sb;
+       struct dentry *dentry;
+
+       pipefs_sb = rpc_get_sb_net(net);
+       if (!pipefs_sb)
+               return NULL;
+       dentry = nfs4blocklayout_register_sb(pipefs_sb, pipe);
+       rpc_put_sb_net(net);
+       return dentry;
+}
+
+static void nfs4blocklayout_unregister_net(struct net *net,
+                                          struct rpc_pipe *pipe)
+{
+       struct super_block *pipefs_sb;
+
+       pipefs_sb = rpc_get_sb_net(net);
+       if (pipefs_sb) {
+               nfs4blocklayout_unregister_sb(pipefs_sb, pipe);
+               rpc_put_sb_net(net);
+       }
+}
+
+static int nfs4blocklayout_net_init(struct net *net)
+{
+       struct nfs_net *nn = net_generic(net, nfs_net_id);
+       struct dentry *dentry;
+
+       init_waitqueue_head(&nn->bl_wq);
+       nn->bl_device_pipe = rpc_mkpipe_data(&bl_upcall_ops, 0);
+       if (IS_ERR(nn->bl_device_pipe))
+               return PTR_ERR(nn->bl_device_pipe);
+       dentry = nfs4blocklayout_register_net(net, nn->bl_device_pipe);
+       if (IS_ERR(dentry)) {
+               rpc_destroy_pipe_data(nn->bl_device_pipe);
+               return PTR_ERR(dentry);
+       }
+       nn->bl_device_pipe->dentry = dentry;
+       return 0;
+}
+
+static void nfs4blocklayout_net_exit(struct net *net)
+{
+       struct nfs_net *nn = net_generic(net, nfs_net_id);
+
+       nfs4blocklayout_unregister_net(net, nn->bl_device_pipe);
+       rpc_destroy_pipe_data(nn->bl_device_pipe);
+       nn->bl_device_pipe = NULL;
+}
+
+static struct pernet_operations nfs4blocklayout_net_ops = {
+       .init = nfs4blocklayout_net_init,
+       .exit = nfs4blocklayout_net_exit,
+};
+
  static int __init nfs4blocklayout_init(void)
  {
-       struct vfsmount *mnt;
-       struct path path;
         int ret;
  
         dprintk("%s: NFSv4 Block Layout Driver Registering...\n", __func__);
@@ -1037,32 +1151,17 @@ static int __init nfs4blocklayout_init(void)
         if (ret)
                 goto out;
  
-       init_waitqueue_head(&bl_wq);
-
-       mnt = rpc_get_mount();
-       if (IS_ERR(mnt)) {
-               ret = PTR_ERR(mnt);
+       ret = rpc_pipefs_notifier_register(&nfs4blocklayout_block);
+       if (ret)
                 goto out_remove;
-       }
-
-       ret = vfs_path_lookup(mnt->mnt_root,
-                             mnt,
-                             NFS_PIPE_DIRNAME, 0, &path);
+       ret = register_pernet_subsys(&nfs4blocklayout_net_ops);
         if (ret)
-               goto out_putrpc;
-
-       bl_device_pipe = rpc_mkpipe(path.dentry, "blocklayout", NULL,
-                                   &bl_upcall_ops, 0);
-       path_put(&path);
-       if (IS_ERR(bl_device_pipe)) {
-               ret = PTR_ERR(bl_device_pipe);
-               goto out_putrpc;
-       }
+               goto out_notifier;
  out:
         return ret;
  
-out_putrpc:
-       rpc_put_mount();
+out_notifier:
+       rpc_pipefs_notifier_unregister(&nfs4blocklayout_block);
  out_remove:
         pnfs_unregister_layoutdriver(&blocklayout_type);
         return ret;
@@ -1073,9 +1172,9 @@ static void __exit nfs4blocklayout_exit(void)
         dprintk("%s: NFSv4 Block Layout Driver Unregistering...\n",
                __func__);
  
+       rpc_pipefs_notifier_unregister(&nfs4blocklayout_block);
+       unregister_pernet_subsys(&nfs4blocklayout_net_ops);
         pnfs_unregister_layoutdriver(&blocklayout_type);
-       rpc_unlink(bl_device_pipe);
-       rpc_put_mount();
  }
  
  MODULE_ALIAS("nfs-layouttype4-3");
diff --git a/fs/nfs/blocklayout/blocklayout.h b/fs/nfs/blocklayout/blocklayout.h
index e31a2df28e70aca040560b8d94403d85d67cd170..03350690118e239161fceb18e5939b97d7e062b4 100644
--- a/fs/nfs/blocklayout/blocklayout.h
+++ b/fs/nfs/blocklayout/blocklayout.h
@@ -37,6 +37,7 @@
  #include <linux/sunrpc/rpc_pipe_fs.h>
  
  #include "../pnfs.h"
+#include "../netns.h"
  
  #define PAGE_CACHE_SECTORS (PAGE_CACHE_SIZE >> SECTOR_SHIFT)
  #define PAGE_CACHE_SECTOR_SHIFT (PAGE_CACHE_SHIFT - SECTOR_SHIFT)
@@ -50,6 +51,7 @@ struct pnfs_block_dev {
         struct list_head                bm_node;
         struct nfs4_deviceid            bm_mdevid;    /* associated devid */
         struct block_device             *bm_mdev;     /* meta device itself */
+       struct net                      *net;
  };
  
  enum exstate4 {
@@ -151,9 +153,9 @@ BLK_LSEG2EXT(struct pnfs_layout_segment *lseg)
         return BLK_LO2EXT(lseg->pls_layout);
  }
  
-struct bl_dev_msg {
-       int32_t status;
-       uint32_t major, minor;
+struct bl_pipe_msg {
+       struct rpc_pipe_msg msg;
+       wait_queue_head_t *bl_wq;
  };
  
  struct bl_msg_hdr {
@@ -161,9 +163,6 @@ struct bl_msg_hdr {
         u16 totallen; /* length of entire message, including hdr itself */
  };
  
-extern struct dentry *bl_device_pipe;
-extern wait_queue_head_t bl_wq;
-
  #define BL_DEVICE_UMOUNT               0x0 /* Umount--delete devices */
  #define BL_DEVICE_MOUNT                0x1 /* Mount--create devices*/
  #define BL_DEVICE_REQUEST_INIT         0x0 /* Start request */
diff --git a/fs/nfs/blocklayout/blocklayoutdev.c b/fs/nfs/blocklayout/blocklayoutdev.c
index d08ba9107fde2fa5c608a9b581cc4f7e12e5cb4f..a5c88a554d921455256bb4dbeea7eaa498da5499 100644
--- a/fs/nfs/blocklayout/blocklayoutdev.c
+++ b/fs/nfs/blocklayout/blocklayoutdev.c
@@ -46,7 +46,7 @@ static int decode_sector_number(__be32 **rp, sector_t *sp)
  
         *rp = xdr_decode_hyper(*rp, &s);
         if (s & 0x1ff) {
-               printk(KERN_WARNING "%s: sector not aligned\n", __func__);
+               printk(KERN_WARNING "NFS: %s: sector not aligned\n", __func__);
                 return -1;
         }
         *sp = s >> SECTOR_SHIFT;
@@ -79,27 +79,30 @@ int nfs4_blkdev_put(struct block_device *bdev)
         return blkdev_put(bdev, FMODE_READ);
  }
  
-static struct bl_dev_msg bl_mount_reply;
-
  ssize_t bl_pipe_downcall(struct file *filp, const char __user *src,
                          size_t mlen)
  {
+       struct nfs_net *nn = net_generic(filp->f_dentry->d_sb->s_fs_info,
+                                        nfs_net_id);
+
         if (mlen != sizeof (struct bl_dev_msg))
                 return -EINVAL;
  
-       if (copy_from_user(&bl_mount_reply, src, mlen) != 0)
+       if (copy_from_user(&nn->bl_mount_reply, src, mlen) != 0)
                 return -EFAULT;
  
-       wake_up(&bl_wq);
+       wake_up(&nn->bl_wq);
  
         return mlen;
  }
  
  void bl_pipe_destroy_msg(struct rpc_pipe_msg *msg)
  {
+       struct bl_pipe_msg *bl_pipe_msg = container_of(msg, struct bl_pipe_msg, msg);
+
         if (msg->errno >= 0)
                 return;
-       wake_up(&bl_wq);
+       wake_up(bl_pipe_msg->bl_wq);
  }
  
  /*
@@ -111,29 +114,33 @@ nfs4_blk_decode_device(struct nfs_server *server,
  {
         struct pnfs_block_dev *rv;
         struct block_device *bd = NULL;
-       struct rpc_pipe_msg msg;
+       struct bl_pipe_msg bl_pipe_msg;
+       struct rpc_pipe_msg *msg = &bl_pipe_msg.msg;
         struct bl_msg_hdr bl_msg = {
                 .type = BL_DEVICE_MOUNT,
                 .totallen = dev->mincount,
         };
         uint8_t *dataptr;
         DECLARE_WAITQUEUE(wq, current);
-       struct bl_dev_msg *reply = &bl_mount_reply;
         int offset, len, i, rc;
+       struct net *net = server->nfs_client->net;
+       struct nfs_net *nn = net_generic(net, nfs_net_id);
+       struct bl_dev_msg *reply = &nn->bl_mount_reply;
  
         dprintk("%s CREATING PIPEFS MESSAGE\n", __func__);
         dprintk("%s: deviceid: %s, mincount: %d\n", __func__, dev->dev_id.data,
                 dev->mincount);
  
-       memset(&msg, 0, sizeof(msg));
-       msg.data = kzalloc(sizeof(bl_msg) + dev->mincount, GFP_NOFS);
-       if (!msg.data) {
+       bl_pipe_msg.bl_wq = &nn->bl_wq;
+       memset(msg, 0, sizeof(*msg));
+       msg->data = kzalloc(sizeof(bl_msg) + dev->mincount, GFP_NOFS);
+       if (!msg->data) {
                 rv = ERR_PTR(-ENOMEM);
                 goto out;
         }
  
-       memcpy(msg.data, &bl_msg, sizeof(bl_msg));
-       dataptr = (uint8_t *) msg.data;
+       memcpy(msg->data, &bl_msg, sizeof(bl_msg));
+       dataptr = (uint8_t *) msg->data;
         len = dev->mincount;
         offset = sizeof(bl_msg);
         for (i = 0; len > 0; i++) {
@@ -142,13 +149,13 @@ nfs4_blk_decode_device(struct nfs_server *server,
                 len -= PAGE_CACHE_SIZE;
                 offset += PAGE_CACHE_SIZE;
         }
-       msg.len = sizeof(bl_msg) + dev->mincount;
+       msg->len = sizeof(bl_msg) + dev->mincount;
  
         dprintk("%s CALLING USERSPACE DAEMON\n", __func__);
-       add_wait_queue(&bl_wq, &wq);
-       rc = rpc_queue_upcall(bl_device_pipe->d_inode, &msg);
+       add_wait_queue(&nn->bl_wq, &wq);
+       rc = rpc_queue_upcall(nn->bl_device_pipe, msg);
         if (rc < 0) {
-               remove_wait_queue(&bl_wq, &wq);
+               remove_wait_queue(&nn->bl_wq, &wq);
                 rv = ERR_PTR(rc);
                 goto out;
         }
@@ -156,7 +163,7 @@ nfs4_blk_decode_device(struct nfs_server *server,
         set_current_state(TASK_UNINTERRUPTIBLE);
         schedule();
         __set_current_state(TASK_RUNNING);
-       remove_wait_queue(&bl_wq, &wq);
+       remove_wait_queue(&nn->bl_wq, &wq);
  
         if (reply->status != BL_DEVICE_REQUEST_PROC) {
                 dprintk("%s failed to open device: %d\n",
@@ -181,13 +188,14 @@ nfs4_blk_decode_device(struct nfs_server *server,
  
         rv->bm_mdev = bd;
         memcpy(&rv->bm_mdevid, &dev->dev_id, sizeof(struct nfs4_deviceid));
+       rv->net = net;
         dprintk("%s Created device %s with bd_block_size %u\n",
                 __func__,
                 bd->bd_disk->disk_name,
                 bd->bd_block_size);
  
  out:
-       kfree(msg.data);
+       kfree(msg->data);
         return rv;
  }
  
diff --git a/fs/nfs/blocklayout/blocklayoutdm.c b/fs/nfs/blocklayout/blocklayoutdm.c
index d055c75580734853a29ae2b3553d5c0268bf44f9..737d839bc17b5aa0ae58e5350a235af1f8adfb2b 100644
--- a/fs/nfs/blocklayout/blocklayoutdm.c
+++ b/fs/nfs/blocklayout/blocklayoutdm.c
@@ -38,9 +38,10 @@
  
  #define NFSDBG_FACILITY         NFSDBG_PNFS_LD
  
-static void dev_remove(dev_t dev)
+static void dev_remove(struct net *net, dev_t dev)
  {
-       struct rpc_pipe_msg msg;
+       struct bl_pipe_msg bl_pipe_msg;
+       struct rpc_pipe_msg *msg = &bl_pipe_msg.msg;
         struct bl_dev_msg bl_umount_request;
         struct bl_msg_hdr bl_msg = {
                 .type = BL_DEVICE_UMOUNT,
@@ -48,36 +49,38 @@ static void dev_remove(dev_t dev)
         };
         uint8_t *dataptr;
         DECLARE_WAITQUEUE(wq, current);
+       struct nfs_net *nn = net_generic(net, nfs_net_id);
  
         dprintk("Entering %s\n", __func__);
  
-       memset(&msg, 0, sizeof(msg));
-       msg.data = kzalloc(1 + sizeof(bl_umount_request), GFP_NOFS);
-       if (!msg.data)
+       bl_pipe_msg.bl_wq = &nn->bl_wq;
+       memset(msg, 0, sizeof(*msg));
+       msg->data = kzalloc(1 + sizeof(bl_umount_request), GFP_NOFS);
+       if (!msg->data)
                 goto out;
  
         memset(&bl_umount_request, 0, sizeof(bl_umount_request));
         bl_umount_request.major = MAJOR(dev);
         bl_umount_request.minor = MINOR(dev);
  
-       memcpy(msg.data, &bl_msg, sizeof(bl_msg));
-       dataptr = (uint8_t *) msg.data;
+       memcpy(msg->data, &bl_msg, sizeof(bl_msg));
+       dataptr = (uint8_t *) msg->data;
         memcpy(&dataptr[sizeof(bl_msg)], &bl_umount_request, sizeof(bl_umount_request));
-       msg.len = sizeof(bl_msg) + bl_msg.totallen;
+       msg->len = sizeof(bl_msg) + bl_msg.totallen;
  
-       add_wait_queue(&bl_wq, &wq);
-       if (rpc_queue_upcall(bl_device_pipe->d_inode, &msg) < 0) {
-               remove_wait_queue(&bl_wq, &wq);
+       add_wait_queue(&nn->bl_wq, &wq);
+       if (rpc_queue_upcall(nn->bl_device_pipe, msg) < 0) {
+               remove_wait_queue(&nn->bl_wq, &wq);
                 goto out;
         }
  
         set_current_state(TASK_UNINTERRUPTIBLE);
         schedule();
         __set_current_state(TASK_RUNNING);
-       remove_wait_queue(&bl_wq, &wq);
+       remove_wait_queue(&nn->bl_wq, &wq);
  
  out:
-       kfree(msg.data);
+       kfree(msg->data);
  }
  
  /*
@@ -90,10 +93,10 @@ static void nfs4_blk_metadev_release(struct pnfs_block_dev *bdev)
         dprintk("%s Releasing\n", __func__);
         rv = nfs4_blkdev_put(bdev->bm_mdev);
         if (rv)
-               printk(KERN_ERR "%s nfs4_blkdev_put returns %d\n",
+               printk(KERN_ERR "NFS: %s nfs4_blkdev_put returns %d\n",
                                 __func__, rv);
  
-       dev_remove(bdev->bm_mdev->bd_dev);
+       dev_remove(bdev->net, bdev->bm_mdev->bd_dev);
  }
  
  void bl_free_block_dev(struct pnfs_block_dev *bdev)
diff --git a/fs/nfs/blocklayout/extents.c b/fs/nfs/blocklayout/extents.c
index 1abac09f7cd5f9fd46cc07401873067e49a4b7f7..1f9a6032796b0ff239f2337fae7b656deb71ced6 100644
--- a/fs/nfs/blocklayout/extents.c
+++ b/fs/nfs/blocklayout/extents.c
@@ -147,7 +147,7 @@ static int _preload_range(struct pnfs_inval_markings *marks,
         count = (int)(end - start) / (int)tree->mtt_step_size;
  
         /* Pre-malloc what memory we might need */
-       storage = kmalloc(sizeof(*storage) * count, GFP_NOFS);
+       storage = kcalloc(count, sizeof(*storage), GFP_NOFS);
         if (!storage)
                 return -ENOMEM;
         for (i = 0; i < count; i++) {
diff --git a/fs/nfs/cache_lib.c b/fs/nfs/cache_lib.c
index c98b439332fcf913bcc4dfb4e34242dfed5c70a0..dded2636811182497c5c78d24f51e81577d7aad5 100644
--- a/fs/nfs/cache_lib.c
+++ b/fs/nfs/cache_lib.c
@@ -13,6 +13,7 @@
  #include <linux/slab.h>
  #include <linux/sunrpc/cache.h>
  #include <linux/sunrpc/rpc_pipe_fs.h>
+#include <net/net_namespace.h>
  
  #include "cache_lib.h"
  
@@ -111,30 +112,54 @@ int nfs_cache_wait_for_upcall(struct nfs_cache_defer_req *dreq)
         return 0;
  }
  
-int nfs_cache_register(struct cache_detail *cd)
+int nfs_cache_register_sb(struct super_block *sb, struct cache_detail *cd)
  {
-       struct vfsmount *mnt;
-       struct path path;
         int ret;
+       struct dentry *dir;
  
-       mnt = rpc_get_mount();
-       if (IS_ERR(mnt))
-               return PTR_ERR(mnt);
-       ret = vfs_path_lookup(mnt->mnt_root, mnt, "/cache", 0, &path);
-       if (ret)
-               goto err;
-       ret = sunrpc_cache_register_pipefs(path.dentry, cd->name, 0600, cd);
-       path_put(&path);
-       if (!ret)
-               return ret;
-err:
-       rpc_put_mount();
+       dir = rpc_d_lookup_sb(sb, "cache");
+       BUG_ON(dir == NULL);
+       ret = sunrpc_cache_register_pipefs(dir, cd->name, 0600, cd);
+       dput(dir);
         return ret;
  }
  
-void nfs_cache_unregister(struct cache_detail *cd)
+int nfs_cache_register_net(struct net *net, struct cache_detail *cd)
  {
-       sunrpc_cache_unregister_pipefs(cd);
-       rpc_put_mount();
+       struct super_block *pipefs_sb;
+       int ret = 0;
+
+       pipefs_sb = rpc_get_sb_net(net);
+       if (pipefs_sb) {
+               ret = nfs_cache_register_sb(pipefs_sb, cd);
+               rpc_put_sb_net(net);
+       }
+       return ret;
+}
+
+void nfs_cache_unregister_sb(struct super_block *sb, struct cache_detail *cd)
+{
+       if (cd->u.pipefs.dir)
+               sunrpc_cache_unregister_pipefs(cd);
+}
+
+void nfs_cache_unregister_net(struct net *net, struct cache_detail *cd)
+{
+       struct super_block *pipefs_sb;
+
+       pipefs_sb = rpc_get_sb_net(net);
+       if (pipefs_sb) {
+               nfs_cache_unregister_sb(pipefs_sb, cd);
+               rpc_put_sb_net(net);
+       }
+}
+
+void nfs_cache_init(struct cache_detail *cd)
+{
+       sunrpc_init_cache_detail(cd);
  }
  
+void nfs_cache_destroy(struct cache_detail *cd)
+{
+       sunrpc_destroy_cache_detail(cd);
+}
diff --git a/fs/nfs/cache_lib.h b/fs/nfs/cache_lib.h
index 7cf6cafcc007d8a5350aee6eae30647dceaab802..317db95e37f80375b371130afd58cb31f39161ed 100644
--- a/fs/nfs/cache_lib.h
+++ b/fs/nfs/cache_lib.h
@@ -23,5 +23,11 @@ extern struct nfs_cache_defer_req *nfs_cache_defer_req_alloc(void);
  extern void nfs_cache_defer_req_put(struct nfs_cache_defer_req *dreq);
  extern int nfs_cache_wait_for_upcall(struct nfs_cache_defer_req *dreq);
  
-extern int nfs_cache_register(struct cache_detail *cd);
-extern void nfs_cache_unregister(struct cache_detail *cd);
+extern void nfs_cache_init(struct cache_detail *cd);
+extern void nfs_cache_destroy(struct cache_detail *cd);
+extern int nfs_cache_register_net(struct net *net, struct cache_detail *cd);
+extern void nfs_cache_unregister_net(struct net *net, struct cache_detail *cd);
+extern int nfs_cache_register_sb(struct super_block *sb,
+                                struct cache_detail *cd);
+extern void nfs_cache_unregister_sb(struct super_block *sb,
+                                   struct cache_detail *cd);
diff --git a/fs/nfs/callback.c b/fs/nfs/callback.c
index 516f3375e067d584aa26b2a65d82f4946b647e30..eb95f5091c1aff93930e17a829a808023edc2e12 100644
--- a/fs/nfs/callback.c
+++ b/fs/nfs/callback.c
@@ -85,7 +85,7 @@ nfs4_callback_svc(void *vrqstp)
                 }
                 if (err < 0) {
                         if (err != preverr) {
-                               printk(KERN_WARNING "%s: unexpected error "
+                               printk(KERN_WARNING "NFS: %s: unexpected error "
                                         "from svc_recv (%d)\n", __func__, err);
                                 preverr = err;
                         }
@@ -101,12 +101,12 @@ nfs4_callback_svc(void *vrqstp)
  /*
   * Prepare to bring up the NFSv4 callback service
   */
-struct svc_rqst *
-nfs4_callback_up(struct svc_serv *serv)
+static struct svc_rqst *
+nfs4_callback_up(struct svc_serv *serv, struct rpc_xprt *xprt)
  {
         int ret;
  
-       ret = svc_create_xprt(serv, "tcp", &init_net, PF_INET,
+       ret = svc_create_xprt(serv, "tcp", xprt->xprt_net, PF_INET,
                                 nfs_callback_set_tcpport, SVC_SOCK_ANONYMOUS);
         if (ret <= 0)
                 goto out_err;
@@ -114,7 +114,7 @@ nfs4_callback_up(struct svc_serv *serv)
         dprintk("NFS: Callback listener port = %u (af %u)\n",
                         nfs_callback_tcpport, PF_INET);
  
-       ret = svc_create_xprt(serv, "tcp", &init_net, PF_INET6,
+       ret = svc_create_xprt(serv, "tcp", xprt->xprt_net, PF_INET6,
                                 nfs_callback_set_tcpport, SVC_SOCK_ANONYMOUS);
         if (ret > 0) {
                 nfs_callback_tcpport6 = ret;
@@ -172,7 +172,7 @@ nfs41_callback_svc(void *vrqstp)
  /*
   * Bring up the NFSv4.1 callback service
   */
-struct svc_rqst *
+static struct svc_rqst *
  nfs41_callback_up(struct svc_serv *serv, struct rpc_xprt *xprt)
  {
         struct svc_rqst *rqstp;
@@ -183,7 +183,7 @@ nfs41_callback_up(struct svc_serv *serv, struct rpc_xprt *xprt)
          * fore channel connection.
          * Returns the input port (0) and sets the svc_serv bc_xprt on success
          */
-       ret = svc_create_xprt(serv, "tcp-bc", &init_net, PF_INET, 0,
+       ret = svc_create_xprt(serv, "tcp-bc", xprt->xprt_net, PF_INET, 0,
                               SVC_SOCK_ANONYMOUS);
         if (ret < 0) {
                 rqstp = ERR_PTR(ret);
@@ -269,7 +269,7 @@ int nfs_callback_up(u32 minorversion, struct rpc_xprt *xprt)
                                         serv, xprt, &rqstp, &callback_svc);
         if (!minorversion_setup) {
                 /* v4.0 callback setup */
-               rqstp = nfs4_callback_up(serv);
+               rqstp = nfs4_callback_up(serv, xprt);
                 callback_svc = nfs4_callback_svc;
         }
  
@@ -332,7 +332,6 @@ void nfs_callback_down(int minorversion)
  int
  check_gss_callback_principal(struct nfs_client *clp, struct svc_rqst *rqstp)
  {
-       struct rpc_clnt *r = clp->cl_rpcclient;
         char *p = svc_gss_principal(rqstp);
  
         if (rqstp->rq_authop->flavour != RPC_AUTH_GSS)
@@ -353,7 +352,7 @@ check_gss_callback_principal(struct nfs_client *clp, struct svc_rqst *rqstp)
         if (memcmp(p, "nfs@", 4) != 0)
                 return 0;
         p += 4;
-       if (strcmp(p, r->cl_server) != 0)
+       if (strcmp(p, clp->cl_hostname) != 0)
                 return 0;
         return 1;
  }
diff --git a/fs/nfs/callback.h b/fs/nfs/callback.h
index c89d3b9e483c463cb1b9232e4b97520a7b7e1eaf..a5527c90a5aae67a67e2320ffa9d2dea1b00d4d3 100644
--- a/fs/nfs/callback.h
+++ b/fs/nfs/callback.h
@@ -38,7 +38,8 @@ enum nfs4_callback_opnum {
  struct cb_process_state {
         __be32                  drc_status;
         struct nfs_client       *clp;
-       int                     slotid;
+       u32                     slotid;
+       struct net              *net;
  };
  
  struct cb_compound_hdr_arg {
diff --git a/fs/nfs/callback_proc.c b/fs/nfs/callback_proc.c
index 54cea8ad5a76ff6f8796030c4f42ad7f70d12ca6..1b5d809a105e42d992344aad78fb57526c2af8fa 100644
--- a/fs/nfs/callback_proc.c
+++ b/fs/nfs/callback_proc.c
@@ -8,6 +8,7 @@
  #include <linux/nfs4.h>
  #include <linux/nfs_fs.h>
  #include <linux/slab.h>
+#include <linux/rcupdate.h>
  #include "nfs4_fs.h"
  #include "callback.h"
  #include "delegation.h"
@@ -33,7 +34,7 @@ __be32 nfs4_callback_getattr(struct cb_getattrargs *args,
         res->bitmap[0] = res->bitmap[1] = 0;
         res->status = htonl(NFS4ERR_BADHANDLE);
  
-       dprintk("NFS: GETATTR callback request from %s\n",
+       dprintk_rcu("NFS: GETATTR callback request from %s\n",
                 rpc_peeraddr2str(cps->clp->cl_rpcclient, RPC_DISPLAY_ADDR));
  
         inode = nfs_delegation_find_inode(cps->clp, &args->fh);
@@ -73,7 +74,7 @@ __be32 nfs4_callback_recall(struct cb_recallargs *args, void *dummy,
         if (!cps->clp) /* Always set for v4.0. Set in cb_sequence for v4.1 */
                 goto out;
  
-       dprintk("NFS: RECALL callback request from %s\n",
+       dprintk_rcu("NFS: RECALL callback request from %s\n",
                 rpc_peeraddr2str(cps->clp->cl_rpcclient, RPC_DISPLAY_ADDR));
  
         res = htonl(NFS4ERR_BADHANDLE);
@@ -86,8 +87,7 @@ __be32 nfs4_callback_recall(struct cb_recallargs *args, void *dummy,
                 res = 0;
                 break;
         case -ENOENT:
-               if (res != 0)
-                       res = htonl(NFS4ERR_BAD_STATEID);
+               res = htonl(NFS4ERR_BAD_STATEID);
                 break;
         default:
                 res = htonl(NFS4ERR_RESOURCE);
@@ -98,52 +98,64 @@ out:
         return res;
  }
  
-int nfs4_validate_delegation_stateid(struct nfs_delegation *delegation, const nfs4_stateid *stateid)
-{
-       if (delegation == NULL || memcmp(delegation->stateid.data, stateid->data,
-                                        sizeof(delegation->stateid.data)) != 0)
-               return 0;
-       return 1;
-}
-
  #if defined(CONFIG_NFS_V4_1)
  
-static u32 initiate_file_draining(struct nfs_client *clp,
-                                 struct cb_layoutrecallargs *args)
+/*
+ * Lookup a layout by filehandle.
+ *
+ * Note: gets a refcount on the layout hdr and on its respective inode.
+ * Caller must put the layout hdr and the inode.
+ *
+ * TODO: keep track of all layouts (and delegations) in a hash table
+ * hashed by filehandle.
+ */
+static struct pnfs_layout_hdr * get_layout_by_fh_locked(struct nfs_client *clp, struct nfs_fh *fh)
  {
         struct nfs_server *server;
-       struct pnfs_layout_hdr *lo;
         struct inode *ino;
-       bool found = false;
-       u32 rv = NFS4ERR_NOMATCHING_LAYOUT;
-       LIST_HEAD(free_me_list);
+       struct pnfs_layout_hdr *lo;
  
-       spin_lock(&clp->cl_lock);
-       rcu_read_lock();
         list_for_each_entry_rcu(server, &clp->cl_superblocks, client_link) {
                 list_for_each_entry(lo, &server->layouts, plh_layouts) {
-                       if (nfs_compare_fh(&args->cbl_fh,
-                                          &NFS_I(lo->plh_inode)->fh))
+                       if (nfs_compare_fh(fh, &NFS_I(lo->plh_inode)->fh))
                                 continue;
                         ino = igrab(lo->plh_inode);
                         if (!ino)
                                 continue;
-                       found = true;
-                       /* Without this, layout can be freed as soon
-                        * as we release cl_lock.
-                        */
                         get_layout_hdr(lo);
-                       break;
+                       return lo;
                 }
-               if (found)
-                       break;
         }
+
+       return NULL;
+}
+
+static struct pnfs_layout_hdr * get_layout_by_fh(struct nfs_client *clp, struct nfs_fh *fh)
+{
+       struct pnfs_layout_hdr *lo;
+
+       spin_lock(&clp->cl_lock);
+       rcu_read_lock();
+       lo = get_layout_by_fh_locked(clp, fh);
         rcu_read_unlock();
         spin_unlock(&clp->cl_lock);
  
-       if (!found)
+       return lo;
+}
+
+static u32 initiate_file_draining(struct nfs_client *clp,
+                                 struct cb_layoutrecallargs *args)
+{
+       struct inode *ino;
+       struct pnfs_layout_hdr *lo;
+       u32 rv = NFS4ERR_NOMATCHING_LAYOUT;
+       LIST_HEAD(free_me_list);
+
+       lo = get_layout_by_fh(clp, &args->cbl_fh);
+       if (!lo)
                 return NFS4ERR_NOMATCHING_LAYOUT;
  
+       ino = lo->plh_inode;
         spin_lock(&ino->i_lock);
         if (test_bit(NFS_LAYOUT_BULK_RECALL, &lo->plh_flags) ||
             mark_matching_lsegs_invalid(lo, &free_me_list,
@@ -213,17 +225,13 @@ static u32 initiate_bulk_draining(struct nfs_client *clp,
  static u32 do_callback_layoutrecall(struct nfs_client *clp,
                                     struct cb_layoutrecallargs *args)
  {
-       u32 res = NFS4ERR_DELAY;
+       u32 res;
  
         dprintk("%s enter, type=%i\n", __func__, args->cbl_recall_type);
-       if (test_and_set_bit(NFS4CLNT_LAYOUTRECALL, &clp->cl_state))
-               goto out;
         if (args->cbl_recall_type == RETURN_FILE)
                 res = initiate_file_draining(clp, args);
         else
                 res = initiate_bulk_draining(clp, args);
-       clear_bit(NFS4CLNT_LAYOUTRECALL, &clp->cl_state);
-out:
         dprintk("%s returning %i\n", __func__, res);
         return res;
  
@@ -303,21 +311,6 @@ out:
         return res;
  }
  
-int nfs41_validate_delegation_stateid(struct nfs_delegation *delegation, const nfs4_stateid *stateid)
-{
-       if (delegation == NULL)
-               return 0;
-
-       if (stateid->stateid.seqid != 0)
-               return 0;
-       if (memcmp(&delegation->stateid.stateid.other,
-                  &stateid->stateid.other,
-                  NFS4_STATEID_OTHER_SIZE))
-               return 0;
-
-       return 1;
-}
-
  /*
   * Validate the sequenceID sent by the server.
   * Return success if the sequenceID is one more than what we last saw on
@@ -441,7 +434,7 @@ __be32 nfs4_callback_sequence(struct cb_sequenceargs *args,
         int i;
         __be32 status = htonl(NFS4ERR_BADSESSION);
  
-       clp = nfs4_find_client_sessionid(args->csa_addr, &args->csa_sessionid);
+       clp = nfs4_find_client_sessionid(cps->net, args->csa_addr, &args->csa_sessionid);
         if (clp == NULL)
                 goto out;
  
@@ -517,7 +510,7 @@ __be32 nfs4_callback_recallany(struct cb_recallanyargs *args, void *dummy,
         if (!cps->clp) /* set in cb_sequence */
                 goto out;
  
-       dprintk("NFS: RECALL_ANY callback request from %s\n",
+       dprintk_rcu("NFS: RECALL_ANY callback request from %s\n",
                 rpc_peeraddr2str(cps->clp->cl_rpcclient, RPC_DISPLAY_ADDR));
  
         status = cpu_to_be32(NFS4ERR_INVAL);
@@ -552,7 +545,7 @@ __be32 nfs4_callback_recallslot(struct cb_recallslotargs *args, void *dummy,
         if (!cps->clp) /* set in cb_sequence */
                 goto out;
  
-       dprintk("NFS: CB_RECALL_SLOT request from %s target max slots %d\n",
+       dprintk_rcu("NFS: CB_RECALL_SLOT request from %s target max slots %d\n",
                 rpc_peeraddr2str(cps->clp->cl_rpcclient, RPC_DISPLAY_ADDR),
                 args->crsa_target_max_slots);
  
diff --git a/fs/nfs/callback_xdr.c b/fs/nfs/callback_xdr.c
index d50b2742f23baeb20d54c44d6919ed126faf74c8..95bfc243992c1a822041d7a205bbca23162bf91d 100644
--- a/fs/nfs/callback_xdr.c
+++ b/fs/nfs/callback_xdr.c
@@ -9,6 +9,8 @@
  #include <linux/sunrpc/svc.h>
  #include <linux/nfs4.h>
  #include <linux/nfs_fs.h>
+#include <linux/ratelimit.h>
+#include <linux/printk.h>
  #include <linux/slab.h>
  #include <linux/sunrpc/bc_xprt.h>
  #include "nfs4_fs.h"
@@ -73,7 +75,7 @@ static __be32 *read_buf(struct xdr_stream *xdr, int nbytes)
  
         p = xdr_inline_decode(xdr, nbytes);
         if (unlikely(p == NULL))
-               printk(KERN_WARNING "NFSv4 callback reply buffer overflowed!\n");
+               printk(KERN_WARNING "NFS: NFSv4 callback reply buffer overflowed!\n");
         return p;
  }
  
@@ -138,10 +140,10 @@ static __be32 decode_stateid(struct xdr_stream *xdr, nfs4_stateid *stateid)
  {
         __be32 *p;
  
-       p = read_buf(xdr, 16);
+       p = read_buf(xdr, NFS4_STATEID_SIZE);
         if (unlikely(p == NULL))
                 return htonl(NFS4ERR_RESOURCE);
-       memcpy(stateid->data, p, 16);
+       memcpy(stateid, p, NFS4_STATEID_SIZE);
         return 0;
  }
  
@@ -155,7 +157,7 @@ static __be32 decode_compound_hdr_arg(struct xdr_stream *xdr, struct cb_compound
                 return status;
         /* We do not like overly long tags! */
         if (hdr->taglen > CB_OP_TAGLEN_MAXSZ - 12) {
-               printk("NFSv4 CALLBACK %s: client sent tag of length %u\n",
+               printk("NFS: NFSv4 CALLBACK %s: client sent tag of length %u\n",
                                 __func__, hdr->taglen);
                 return htonl(NFS4ERR_RESOURCE);
         }
@@ -167,7 +169,7 @@ static __be32 decode_compound_hdr_arg(struct xdr_stream *xdr, struct cb_compound
         if (hdr->minorversion <= 1) {
                 hdr->cb_ident = ntohl(*p++); /* ignored by v4.1 */
         } else {
-               printk(KERN_WARNING "%s: NFSv4 server callback with "
+               pr_warn_ratelimited("NFS: %s: NFSv4 server callback with "
                         "illegal minor version %u!\n",
                         __func__, hdr->minorversion);
                 return htonl(NFS4ERR_MINOR_VERS_MISMATCH);
@@ -759,14 +761,14 @@ static void nfs4_callback_free_slot(struct nfs4_session *session)
          * Let the state manager know callback processing done.
          * A single slot, so highest used slotid is either 0 or -1
          */
-       tbl->highest_used_slotid = -1;
+       tbl->highest_used_slotid = NFS4_NO_SLOT;
         nfs4_check_drain_bc_complete(session);
         spin_unlock(&tbl->slot_tbl_lock);
  }
  
  static void nfs4_cb_free_slot(struct cb_process_state *cps)
  {
-       if (cps->slotid != -1)
+       if (cps->slotid != NFS4_NO_SLOT)
                 nfs4_callback_free_slot(cps->clp->cl_session);
  }
  
@@ -860,7 +862,8 @@ static __be32 nfs4_callback_compound(struct svc_rqst *rqstp, void *argp, void *r
         struct cb_process_state cps = {
                 .drc_status = 0,
                 .clp = NULL,
-               .slotid = -1,
+               .slotid = NFS4_NO_SLOT,
+               .net = rqstp->rq_xprt->xpt_net,
         };
         unsigned int nops = 0;
  
@@ -876,7 +879,7 @@ static __be32 nfs4_callback_compound(struct svc_rqst *rqstp, void *argp, void *r
                 return rpc_garbage_args;
  
         if (hdr_arg.minorversion == 0) {
-               cps.clp = nfs4_find_client_ident(hdr_arg.cb_ident);
+               cps.clp = nfs4_find_client_ident(rqstp->rq_xprt->xpt_net, hdr_arg.cb_ident);
                 if (!cps.clp || !check_gss_callback_principal(cps.clp, rqstp))
                         return rpc_drop_reply;
         }
diff --git a/fs/nfs/client.c b/fs/nfs/client.c
index d4f772ebd1efd7f86d4da5f9c83f96da8a76c1bf..4a108a0a2a6085e75c4cb42a2564d32013e28bce 100644
--- a/fs/nfs/client.c
+++ b/fs/nfs/client.c
@@ -40,6 +40,8 @@
  #include <net/ipv6.h>
  #include <linux/nfs_xdr.h>
  #include <linux/sunrpc/bc_xprt.h>
+#include <linux/nsproxy.h>
+#include <linux/pid_namespace.h>
  
  #include <asm/system.h>
  
@@ -50,15 +52,12 @@
  #include "internal.h"
  #include "fscache.h"
  #include "pnfs.h"
+#include "netns.h"
  
  #define NFSDBG_FACILITY                NFSDBG_CLIENT
  
-static DEFINE_SPINLOCK(nfs_client_lock);
-static LIST_HEAD(nfs_client_list);
-static LIST_HEAD(nfs_volume_list);
  static DECLARE_WAIT_QUEUE_HEAD(nfs_client_active_wq);
  #ifdef CONFIG_NFS_V4
-static DEFINE_IDR(cb_ident_idr); /* Protected by nfs_client_lock */
  
  /*
   * Get a unique NFSv4.0 callback identifier which will be used
@@ -67,15 +66,16 @@ static DEFINE_IDR(cb_ident_idr); /* Protected by nfs_client_lock */
  static int nfs_get_cb_ident_idr(struct nfs_client *clp, int minorversion)
  {
         int ret = 0;
+       struct nfs_net *nn = net_generic(clp->net, nfs_net_id);
  
         if (clp->rpc_ops->version != 4 || minorversion != 0)
                 return ret;
  retry:
-       if (!idr_pre_get(&cb_ident_idr, GFP_KERNEL))
+       if (!idr_pre_get(&nn->cb_ident_idr, GFP_KERNEL))
                 return -ENOMEM;
-       spin_lock(&nfs_client_lock);
-       ret = idr_get_new(&cb_ident_idr, clp, &clp->cl_cb_ident);
-       spin_unlock(&nfs_client_lock);
+       spin_lock(&nn->nfs_client_lock);
+       ret = idr_get_new(&nn->cb_ident_idr, clp, &clp->cl_cb_ident);
+       spin_unlock(&nn->nfs_client_lock);
         if (ret == -EAGAIN)
                 goto retry;
         return ret;
@@ -90,7 +90,7 @@ static bool nfs4_disable_idmapping = true;
  /*
   * RPC cruft for NFS
   */
-static struct rpc_version *nfs_version[5] = {
+static const struct rpc_version *nfs_version[5] = {
         [2]                     = &nfs_version2,
  #ifdef CONFIG_NFS_V3
         [3]                     = &nfs_version3,
@@ -100,7 +100,7 @@ static struct rpc_version *nfs_version[5] = {
  #endif
  };
  
-struct rpc_program nfs_program = {
+const struct rpc_program nfs_program = {
         .name                   = "nfs",
         .number                 = NFS_PROGRAM,
         .nrvers                 = ARRAY_SIZE(nfs_version),
@@ -116,11 +116,11 @@ struct rpc_stat nfs_rpcstat = {
  
  #ifdef CONFIG_NFS_V3_ACL
  static struct rpc_stat         nfsacl_rpcstat = { &nfsacl_program };
-static struct rpc_version *    nfsacl_version[] = {
+static const struct rpc_version *nfsacl_version[] = {
         [3]                     = &nfsacl_version3,
  };
  
-struct rpc_program             nfsacl_program = {
+const struct rpc_program nfsacl_program = {
         .name                   = "nfsacl",
         .number                 = NFS_ACL_PROGRAM,
         .nrvers                 = ARRAY_SIZE(nfsacl_version),
@@ -136,6 +136,7 @@ struct nfs_client_initdata {
         const struct nfs_rpc_ops *rpc_ops;
         int proto;
         u32 minorversion;
+       struct net *net;
  };
  
  /*
@@ -172,6 +173,7 @@ static struct nfs_client *nfs_alloc_client(const struct nfs_client_initdata *cl_
         clp->cl_rpcclient = ERR_PTR(-EINVAL);
  
         clp->cl_proto = cl_init->proto;
+       clp->net = get_net(cl_init->net);
  
  #ifdef CONFIG_NFS_V4
         err = nfs_get_cb_ident_idr(clp, cl_init->minorversion);
@@ -203,8 +205,11 @@ error_0:
  #ifdef CONFIG_NFS_V4_1
  static void nfs4_shutdown_session(struct nfs_client *clp)
  {
-       if (nfs4_has_session(clp))
+       if (nfs4_has_session(clp)) {
+               nfs4_deviceid_purge_client(clp);
                 nfs4_destroy_session(clp->cl_session);
+       }
+
  }
  #else /* CONFIG_NFS_V4_1 */
  static void nfs4_shutdown_session(struct nfs_client *clp)
@@ -234,16 +239,20 @@ static void nfs4_shutdown_client(struct nfs_client *clp)
  }
  
  /* idr_remove_all is not needed as all id's are removed by nfs_put_client */
-void nfs_cleanup_cb_ident_idr(void)
+void nfs_cleanup_cb_ident_idr(struct net *net)
  {
-       idr_destroy(&cb_ident_idr);
+       struct nfs_net *nn = net_generic(net, nfs_net_id);
+
+       idr_destroy(&nn->cb_ident_idr);
  }
  
  /* nfs_client_lock held */
  static void nfs_cb_idr_remove_locked(struct nfs_client *clp)
  {
+       struct nfs_net *nn = net_generic(clp->net, nfs_net_id);
+
         if (clp->cl_cb_ident)
-               idr_remove(&cb_ident_idr, clp->cl_cb_ident);
+               idr_remove(&nn->cb_ident_idr, clp->cl_cb_ident);
  }
  
  static void pnfs_init_server(struct nfs_server *server)
@@ -261,7 +270,7 @@ static void nfs4_shutdown_client(struct nfs_client *clp)
  {
  }
  
-void nfs_cleanup_cb_ident_idr(void)
+void nfs_cleanup_cb_ident_idr(struct net *net)
  {
  }
  
@@ -293,10 +302,10 @@ static void nfs_free_client(struct nfs_client *clp)
         if (clp->cl_machine_cred != NULL)
                 put_rpccred(clp->cl_machine_cred);
  
-       nfs4_deviceid_purge_client(clp);
-
+       put_net(clp->net);
         kfree(clp->cl_hostname);
         kfree(clp->server_scope);
+       kfree(clp->impl_id);
         kfree(clp);
  
         dprintk("<-- nfs_free_client()\n");
@@ -307,15 +316,18 @@ static void nfs_free_client(struct nfs_client *clp)
   */
  void nfs_put_client(struct nfs_client *clp)
  {
+       struct nfs_net *nn;
+
         if (!clp)
                 return;
  
         dprintk("--> nfs_put_client({%d})\n", atomic_read(&clp->cl_count));
+       nn = net_generic(clp->net, nfs_net_id);
  
-       if (atomic_dec_and_lock(&clp->cl_count, &nfs_client_lock)) {
+       if (atomic_dec_and_lock(&clp->cl_count, &nn->nfs_client_lock)) {
                 list_del(&clp->cl_share_link);
                 nfs_cb_idr_remove_locked(clp);
-               spin_unlock(&nfs_client_lock);
+               spin_unlock(&nn->nfs_client_lock);
  
                 BUG_ON(!list_empty(&clp->cl_superblocks));
  
@@ -393,6 +405,7 @@ static int nfs_sockaddr_cmp_ip4(const struct sockaddr *sa1,
                 (sin1->sin_port == sin2->sin_port);
  }
  
+#if defined(CONFIG_NFS_V4_1)
  /*
   * Test if two socket addresses represent the same actual socket,
   * by comparing (only) relevant fields, excluding the port number.
@@ -411,6 +424,7 @@ static int nfs_sockaddr_match_ipaddr(const struct sockaddr *sa1,
         }
         return 0;
  }
+#endif /* CONFIG_NFS_V4_1 */
  
  /*
   * Test if two socket addresses represent the same actual socket,
@@ -431,10 +445,10 @@ static int nfs_sockaddr_cmp(const struct sockaddr *sa1,
         return 0;
  }
  
+#if defined(CONFIG_NFS_V4_1)
  /* Common match routine for v4.0 and v4.1 callback services */
-bool
-nfs4_cb_match_client(const struct sockaddr *addr, struct nfs_client *clp,
-                    u32 minorversion)
+static bool nfs4_cb_match_client(const struct sockaddr *addr,
+               struct nfs_client *clp, u32 minorversion)
  {
         struct sockaddr *clap = (struct sockaddr *)&clp->cl_addr;
  
@@ -454,6 +468,7 @@ nfs4_cb_match_client(const struct sockaddr *addr, struct nfs_client *clp,
  
         return true;
  }
+#endif /* CONFIG_NFS_V4_1 */
  
  /*
   * Find an nfs_client on the list that matches the initialisation data
@@ -463,8 +478,9 @@ static struct nfs_client *nfs_match_client(const struct nfs_client_initdata *dat
  {
         struct nfs_client *clp;
         const struct sockaddr *sap = data->addr;
+       struct nfs_net *nn = net_generic(data->net, nfs_net_id);
  
-       list_for_each_entry(clp, &nfs_client_list, cl_share_link) {
+       list_for_each_entry(clp, &nn->nfs_client_list, cl_share_link) {
                 const struct sockaddr *clap = (struct sockaddr *)&clp->cl_addr;
                 /* Don't match clients that failed to initialise properly */
                 if (clp->cl_cons_state < 0)
@@ -502,13 +518,14 @@ nfs_get_client(const struct nfs_client_initdata *cl_init,
  {
         struct nfs_client *clp, *new = NULL;
         int error;
+       struct nfs_net *nn = net_generic(cl_init->net, nfs_net_id);
  
         dprintk("--> nfs_get_client(%s,v%u)\n",
                 cl_init->hostname ?: "", cl_init->rpc_ops->version);
  
         /* see if the client already exists */
         do {
-               spin_lock(&nfs_client_lock);
+               spin_lock(&nn->nfs_client_lock);
  
                 clp = nfs_match_client(cl_init);
                 if (clp)
@@ -516,7 +533,7 @@ nfs_get_client(const struct nfs_client_initdata *cl_init,
                 if (new)
                         goto install_client;
  
-               spin_unlock(&nfs_client_lock);
+               spin_unlock(&nn->nfs_client_lock);
  
                 new = nfs_alloc_client(cl_init);
         } while (!IS_ERR(new));
@@ -527,8 +544,8 @@ nfs_get_client(const struct nfs_client_initdata *cl_init,
         /* install a new client and return with it unready */
  install_client:
         clp = new;
-       list_add(&clp->cl_share_link, &nfs_client_list);
-       spin_unlock(&nfs_client_lock);
+       list_add(&clp->cl_share_link, &nn->nfs_client_list);
+       spin_unlock(&nn->nfs_client_lock);
  
         error = cl_init->rpc_ops->init_client(clp, timeparms, ip_addr,
                                               authflavour, noresvport);
@@ -543,7 +560,7 @@ install_client:
          * - make sure it's ready before returning
          */
  found_client:
-       spin_unlock(&nfs_client_lock);
+       spin_unlock(&nn->nfs_client_lock);
  
         if (new)
                 nfs_free_client(new);
@@ -643,7 +660,7 @@ static int nfs_create_rpc_client(struct nfs_client *clp,
  {
         struct rpc_clnt         *clnt = NULL;
         struct rpc_create_args args = {
-               .net            = &init_net,
+               .net            = clp->net,
                 .protocol       = clp->cl_proto,
                 .address        = (struct sockaddr *)&clp->cl_addr,
                 .addrsize       = clp->cl_addrlen,
@@ -697,6 +714,7 @@ static int nfs_start_lockd(struct nfs_server *server)
                 .nfs_version    = clp->rpc_ops->version,
                 .noresvport     = server->flags & NFS_MOUNT_NORESVPORT ?
                                         1 : 0,
+               .net            = clp->net,
         };
  
         if (nlm_init.nfs_version > 3)
@@ -832,6 +850,7 @@ static int nfs_init_server(struct nfs_server *server,
                 .addrlen = data->nfs_server.addrlen,
                 .rpc_ops = &nfs_v2_clientops,
                 .proto = data->nfs_server.protocol,
+               .net = data->net,
         };
         struct rpc_timeout timeparms;
         struct nfs_client *clp;
@@ -1030,25 +1049,30 @@ static void nfs_server_copy_userdata(struct nfs_server *target, struct nfs_serve
  static void nfs_server_insert_lists(struct nfs_server *server)
  {
         struct nfs_client *clp = server->nfs_client;
+       struct nfs_net *nn = net_generic(clp->net, nfs_net_id);
  
-       spin_lock(&nfs_client_lock);
+       spin_lock(&nn->nfs_client_lock);
         list_add_tail_rcu(&server->client_link, &clp->cl_superblocks);
-       list_add_tail(&server->master_link, &nfs_volume_list);
+       list_add_tail(&server->master_link, &nn->nfs_volume_list);
         clear_bit(NFS_CS_STOP_RENEW, &clp->cl_res_state);
-       spin_unlock(&nfs_client_lock);
+       spin_unlock(&nn->nfs_client_lock);
  
  }
  
  static void nfs_server_remove_lists(struct nfs_server *server)
  {
         struct nfs_client *clp = server->nfs_client;
+       struct nfs_net *nn;
  
-       spin_lock(&nfs_client_lock);
+       if (clp == NULL)
+               return;
+       nn = net_generic(clp->net, nfs_net_id);
+       spin_lock(&nn->nfs_client_lock);
         list_del_rcu(&server->client_link);
-       if (clp && list_empty(&clp->cl_superblocks))
+       if (list_empty(&clp->cl_superblocks))
                 set_bit(NFS_CS_STOP_RENEW, &clp->cl_res_state);
         list_del(&server->master_link);
-       spin_unlock(&nfs_client_lock);
+       spin_unlock(&nn->nfs_client_lock);
  
         synchronize_rcu();
  }
@@ -1087,6 +1111,8 @@ static struct nfs_server *nfs_alloc_server(void)
                 return NULL;
         }
  
+       ida_init(&server->openowner_id);
+       ida_init(&server->lockowner_id);
         pnfs_init_server(server);
  
         return server;
@@ -1112,6 +1138,8 @@ void nfs_free_server(struct nfs_server *server)
  
         nfs_put_client(server->nfs_client);
  
+       ida_destroy(&server->lockowner_id);
+       ida_destroy(&server->openowner_id);
         nfs_free_iostats(server->io_stats);
         bdi_destroy(&server->backing_dev_info);
         kfree(server);
@@ -1187,48 +1215,22 @@ error:
  }
  
  #ifdef CONFIG_NFS_V4
-/*
- * NFSv4.0 callback thread helper
- *
- * Find a client by IP address, protocol version, and minorversion
- *
- * Called from the pg_authenticate method. The callback identifier
- * is not used as it has not been decoded.
- *
- * Returns NULL if no such client
- */
-struct nfs_client *
-nfs4_find_client_no_ident(const struct sockaddr *addr)
-{
-       struct nfs_client *clp;
-
-       spin_lock(&nfs_client_lock);
-       list_for_each_entry(clp, &nfs_client_list, cl_share_link) {
-               if (nfs4_cb_match_client(addr, clp, 0) == false)
-                       continue;
-               atomic_inc(&clp->cl_count);
-               spin_unlock(&nfs_client_lock);
-               return clp;
-       }
-       spin_unlock(&nfs_client_lock);
-       return NULL;
-}
-
  /*
   * NFSv4.0 callback thread helper
   *
   * Find a client by callback identifier
   */
  struct nfs_client *
-nfs4_find_client_ident(int cb_ident)
+nfs4_find_client_ident(struct net *net, int cb_ident)
  {
         struct nfs_client *clp;
+       struct nfs_net *nn = net_generic(net, nfs_net_id);
  
-       spin_lock(&nfs_client_lock);
-       clp = idr_find(&cb_ident_idr, cb_ident);
+       spin_lock(&nn->nfs_client_lock);
+       clp = idr_find(&nn->cb_ident_idr, cb_ident);
         if (clp)
                 atomic_inc(&clp->cl_count);
-       spin_unlock(&nfs_client_lock);
+       spin_unlock(&nn->nfs_client_lock);
         return clp;
  }
  
@@ -1241,13 +1243,14 @@ nfs4_find_client_ident(int cb_ident)
   * Returns NULL if no such client
   */
  struct nfs_client *
-nfs4_find_client_sessionid(const struct sockaddr *addr,
+nfs4_find_client_sessionid(struct net *net, const struct sockaddr *addr,
                            struct nfs4_sessionid *sid)
  {
         struct nfs_client *clp;
+       struct nfs_net *nn = net_generic(net, nfs_net_id);
  
-       spin_lock(&nfs_client_lock);
-       list_for_each_entry(clp, &nfs_client_list, cl_share_link) {
+       spin_lock(&nn->nfs_client_lock);
+       list_for_each_entry(clp, &nn->nfs_client_list, cl_share_link) {
                 if (nfs4_cb_match_client(addr, clp, 1) == false)
                         continue;
  
@@ -1260,17 +1263,17 @@ nfs4_find_client_sessionid(const struct sockaddr *addr,
                         continue;
  
                 atomic_inc(&clp->cl_count);
-               spin_unlock(&nfs_client_lock);
+               spin_unlock(&nn->nfs_client_lock);
                 return clp;
         }
-       spin_unlock(&nfs_client_lock);
+       spin_unlock(&nn->nfs_client_lock);
         return NULL;
  }
  
  #else /* CONFIG_NFS_V4_1 */
  
  struct nfs_client *
-nfs4_find_client_sessionid(const struct sockaddr *addr,
+nfs4_find_client_sessionid(struct net *net, const struct sockaddr *addr,
                            struct nfs4_sessionid *sid)
  {
         return NULL;
@@ -1285,16 +1288,18 @@ static int nfs4_init_callback(struct nfs_client *clp)
         int error;
  
         if (clp->rpc_ops->version == 4) {
+               struct rpc_xprt *xprt;
+
+               xprt = rcu_dereference_raw(clp->cl_rpcclient->cl_xprt);
+
                 if (nfs4_has_session(clp)) {
-                       error = xprt_setup_backchannel(
-                                               clp->cl_rpcclient->cl_xprt,
+                       error = xprt_setup_backchannel(xprt,
                                                 NFS41_BC_MIN_CALLBACKS);
                         if (error < 0)
                                 return error;
                 }
  
-               error = nfs_callback_up(clp->cl_mvops->minor_version,
-                                       clp->cl_rpcclient->cl_xprt);
+               error = nfs_callback_up(clp->cl_mvops->minor_version, xprt);
                 if (error < 0) {
                         dprintk("%s: failed to start callback. Error = %d\n",
                                 __func__, error);
@@ -1345,6 +1350,7 @@ int nfs4_init_client(struct nfs_client *clp,
                      rpc_authflavor_t authflavour,
                      int noresvport)
  {
+       char buf[INET6_ADDRSTRLEN + 1];
         int error;
  
         if (clp->cl_cons_state == NFS_CS_READY) {
@@ -1360,6 +1366,20 @@ int nfs4_init_client(struct nfs_client *clp,
                                       1, noresvport);
         if (error < 0)
                 goto error;
+
+       /* If no clientaddr= option was specified, find a usable cb address */
+       if (ip_addr == NULL) {
+               struct sockaddr_storage cb_addr;
+               struct sockaddr *sap = (struct sockaddr *)&cb_addr;
+
+               error = rpc_localaddr(clp->cl_rpcclient, sap, sizeof(cb_addr));
+               if (error < 0)
+                       goto error;
+               error = rpc_ntop(sap, buf, sizeof(buf));
+               if (error < 0)
+                       goto error;
+               ip_addr = (const char *)buf;
+       }
         strlcpy(clp->cl_ipaddr, ip_addr, sizeof(clp->cl_ipaddr));
  
         error = nfs_idmap_new(clp);
@@ -1394,7 +1414,7 @@ static int nfs4_set_client(struct nfs_server *server,
                 const char *ip_addr,
                 rpc_authflavor_t authflavour,
                 int proto, const struct rpc_timeout *timeparms,
-               u32 minorversion)
+               u32 minorversion, struct net *net)
  {
         struct nfs_client_initdata cl_init = {
                 .hostname = hostname,
@@ -1403,6 +1423,7 @@ static int nfs4_set_client(struct nfs_server *server,
                 .rpc_ops = &nfs_v4_clientops,
                 .proto = proto,
                 .minorversion = minorversion,
+               .net = net,
         };
         struct nfs_client *clp;
         int error;
@@ -1454,6 +1475,7 @@ struct nfs_client *nfs4_set_ds_client(struct nfs_client* mds_clp,
                 .rpc_ops = &nfs_v4_clientops,
                 .proto = ds_proto,
                 .minorversion = mds_clp->cl_minorversion,
+               .net = mds_clp->net,
         };
         struct rpc_timeout ds_timeout = {
                 .to_initval = 15 * HZ,
@@ -1581,7 +1603,8 @@ static int nfs4_init_server(struct nfs_server *server,
                         data->auth_flavors[0],
                         data->nfs_server.protocol,
                         &timeparms,
-                       data->minorversion);
+                       data->minorversion,
+                       data->net);
         if (error < 0)
                 goto error;
  
@@ -1676,9 +1699,10 @@ struct nfs_server *nfs4_create_referral_server(struct nfs_clone_mount *data,
                                 data->addrlen,
                                 parent_client->cl_ipaddr,
                                 data->authflavor,
-                               parent_server->client->cl_xprt->prot,
+                               rpc_protocol(parent_server->client),
                                 parent_server->client->cl_timeout,
-                               parent_client->cl_mvops->minor_version);
+                               parent_client->cl_mvops->minor_version,
+                               parent_client->net);
         if (error < 0)
                 goto error;
  
@@ -1771,6 +1795,18 @@ out_free_server:
         return ERR_PTR(error);
  }
  
+void nfs_clients_init(struct net *net)
+{
+       struct nfs_net *nn = net_generic(net, nfs_net_id);
+
+       INIT_LIST_HEAD(&nn->nfs_client_list);
+       INIT_LIST_HEAD(&nn->nfs_volume_list);
+#ifdef CONFIG_NFS_V4
+       idr_init(&nn->cb_ident_idr);
+#endif
+       spin_lock_init(&nn->nfs_client_lock);
+}
+
  #ifdef CONFIG_PROC_FS
  static struct proc_dir_entry *proc_fs_nfs;
  
@@ -1824,13 +1860,15 @@ static int nfs_server_list_open(struct inode *inode, struct file *file)
  {
         struct seq_file *m;
         int ret;
+       struct pid_namespace *pid_ns = file->f_dentry->d_sb->s_fs_info;
+       struct net *net = pid_ns->child_reaper->nsproxy->net_ns;
  
         ret = seq_open(file, &nfs_server_list_ops);
         if (ret < 0)
                 return ret;
  
         m = file->private_data;
-       m->private = PDE(inode)->data;
+       m->private = net;
  
         return 0;
  }
@@ -1840,9 +1878,11 @@ static int nfs_server_list_open(struct inode *inode, struct file *file)
   */
  static void *nfs_server_list_start(struct seq_file *m, loff_t *_pos)
  {
+       struct nfs_net *nn = net_generic(m->private, nfs_net_id);
+
         /* lock the list against modification */
-       spin_lock(&nfs_client_lock);
-       return seq_list_start_head(&nfs_client_list, *_pos);
+       spin_lock(&nn->nfs_client_lock);
+       return seq_list_start_head(&nn->nfs_client_list, *_pos);
  }
  
  /*
@@ -1850,7 +1890,9 @@ static void *nfs_server_list_start(struct seq_file *m, loff_t *_pos)
   */
  static void *nfs_server_list_next(struct seq_file *p, void *v, loff_t *pos)
  {
-       return seq_list_next(v, &nfs_client_list, pos);
+       struct nfs_net *nn = net_generic(p->private, nfs_net_id);
+
+       return seq_list_next(v, &nn->nfs_client_list, pos);
  }
  
  /*
@@ -1858,7 +1900,9 @@ static void *nfs_server_list_next(struct seq_file *p, void *v, loff_t *pos)
   */
  static void nfs_server_list_stop(struct seq_file *p, void *v)
  {
-       spin_unlock(&nfs_client_lock);
+       struct nfs_net *nn = net_generic(p->private, nfs_net_id);
+
+       spin_unlock(&nn->nfs_client_lock);
  }
  
  /*
@@ -1867,9 +1911,10 @@ static void nfs_server_list_stop(struct seq_file *p, void *v)
  static int nfs_server_list_show(struct seq_file *m, void *v)
  {
         struct nfs_client *clp;
+       struct nfs_net *nn = net_generic(m->private, nfs_net_id);
  
         /* display header on line 1 */
-       if (v == &nfs_client_list) {
+       if (v == &nn->nfs_client_list) {
                 seq_puts(m, "NV SERVER   PORT USE HOSTNAME\n");
                 return 0;
         }
@@ -1881,12 +1926,14 @@ static int nfs_server_list_show(struct seq_file *m, void *v)
         if (clp->cl_cons_state != NFS_CS_READY)
                 return 0;
  
+       rcu_read_lock();
         seq_printf(m, "v%u %s %s %3d %s\n",
                    clp->rpc_ops->version,
                    rpc_peeraddr2str(clp->cl_rpcclient, RPC_DISPLAY_HEX_ADDR),
                    rpc_peeraddr2str(clp->cl_rpcclient, RPC_DISPLAY_HEX_PORT),
                    atomic_read(&clp->cl_count),
                    clp->cl_hostname);
+       rcu_read_unlock();
  
         return 0;
  }
@@ -1898,13 +1945,15 @@ static int nfs_volume_list_open(struct inode *inode, struct file *file)
  {
         struct seq_file *m;
         int ret;
+       struct pid_namespace *pid_ns = file->f_dentry->d_sb->s_fs_info;
+       struct net *net = pid_ns->child_reaper->nsproxy->net_ns;
  
         ret = seq_open(file, &nfs_volume_list_ops);
         if (ret < 0)
                 return ret;
  
         m = file->private_data;
-       m->private = PDE(inode)->data;
+       m->private = net;
  
         return 0;
  }
@@ -1914,9 +1963,11 @@ static int nfs_volume_list_open(struct inode *inode, struct file *file)
   */
  static void *nfs_volume_list_start(struct seq_file *m, loff_t *_pos)
  {
+       struct nfs_net *nn = net_generic(m->private, nfs_net_id);
+
         /* lock the list against modification */
-       spin_lock(&nfs_client_lock);
-       return seq_list_start_head(&nfs_volume_list, *_pos);
+       spin_lock(&nn->nfs_client_lock);
+       return seq_list_start_head(&nn->nfs_volume_list, *_pos);
  }
  
  /*
@@ -1924,7 +1975,9 @@ static void *nfs_volume_list_start(struct seq_file *m, loff_t *_pos)
   */
  static void *nfs_volume_list_next(struct seq_file *p, void *v, loff_t *pos)
  {
-       return seq_list_next(v, &nfs_volume_list, pos);
+       struct nfs_net *nn = net_generic(p->private, nfs_net_id);
+
+       return seq_list_next(v, &nn->nfs_volume_list, pos);
  }
  
  /*
@@ -1932,7 +1985,9 @@ static void *nfs_volume_list_next(struct seq_file *p, void *v, loff_t *pos)
   */
  static void nfs_volume_list_stop(struct seq_file *p, void *v)
  {
-       spin_unlock(&nfs_client_lock);
+       struct nfs_net *nn = net_generic(p->private, nfs_net_id);
+
+       spin_unlock(&nn->nfs_client_lock);
  }
  
  /*
@@ -1943,9 +1998,10 @@ static int nfs_volume_list_show(struct seq_file *m, void *v)
         struct nfs_server *server;
         struct nfs_client *clp;
         char dev[8], fsid[17];
+       struct nfs_net *nn = net_generic(m->private, nfs_net_id);
  
         /* display header on line 1 */
-       if (v == &nfs_volume_list) {
+       if (v == &nn->nfs_volume_list) {
                 seq_puts(m, "NV SERVER   PORT DEV     FSID              FSC\n");
                 return 0;
         }
@@ -1960,6 +2016,7 @@ static int nfs_volume_list_show(struct seq_file *m, void *v)
                  (unsigned long long) server->fsid.major,
                  (unsigned long long) server->fsid.minor);
  
+       rcu_read_lock();
         seq_printf(m, "v%u %s %s %-7s %-17s %s\n",
                    clp->rpc_ops->version,
                    rpc_peeraddr2str(clp->cl_rpcclient, RPC_DISPLAY_HEX_ADDR),
@@ -1967,6 +2024,7 @@ static int nfs_volume_list_show(struct seq_file *m, void *v)
                    dev,
                    fsid,
                    nfs_server_fscache_state(server));
+       rcu_read_unlock();
  
         return 0;
  }
diff --git a/fs/nfs/delegation.c b/fs/nfs/delegation.c
index 7f2654069806f011c4361900041ef6ae64f17de4..89af1d269274f3a91401f704226dec43f9c02528 100644
--- a/fs/nfs/delegation.c
+++ b/fs/nfs/delegation.c
@@ -105,7 +105,7 @@ again:
                         continue;
                 if (!test_bit(NFS_DELEGATED_STATE, &state->flags))
                         continue;
-               if (memcmp(state->stateid.data, stateid->data, sizeof(state->stateid.data)) != 0)
+               if (!nfs4_stateid_match(&state->stateid, stateid))
                         continue;
                 get_nfs_open_context(ctx);
                 spin_unlock(&inode->i_lock);
@@ -139,8 +139,7 @@ void nfs_inode_reclaim_delegation(struct inode *inode, struct rpc_cred *cred,
         if (delegation != NULL) {
                 spin_lock(&delegation->lock);
                 if (delegation->inode != NULL) {
-                       memcpy(delegation->stateid.data, res->delegation.data,
-                              sizeof(delegation->stateid.data));
+                       nfs4_stateid_copy(&delegation->stateid, &res->delegation);
                         delegation->type = res->delegation_type;
                         delegation->maxsize = res->maxsize;
                         oldcred = delegation->cred;
@@ -236,8 +235,7 @@ int nfs_inode_set_delegation(struct inode *inode, struct rpc_cred *cred, struct
         delegation = kmalloc(sizeof(*delegation), GFP_NOFS);
         if (delegation == NULL)
                 return -ENOMEM;
-       memcpy(delegation->stateid.data, res->delegation.data,
-                       sizeof(delegation->stateid.data));
+       nfs4_stateid_copy(&delegation->stateid, &res->delegation);
         delegation->type = res->delegation_type;
         delegation->maxsize = res->maxsize;
         delegation->change_attr = inode->i_version;
@@ -250,19 +248,22 @@ int nfs_inode_set_delegation(struct inode *inode, struct rpc_cred *cred, struct
         old_delegation = rcu_dereference_protected(nfsi->delegation,
                                         lockdep_is_held(&clp->cl_lock));
         if (old_delegation != NULL) {
-               if (memcmp(&delegation->stateid, &old_delegation->stateid,
-                                       sizeof(old_delegation->stateid)) == 0 &&
+               if (nfs4_stateid_match(&delegation->stateid,
+                                       &old_delegation->stateid) &&
                                 delegation->type == old_delegation->type) {
                         goto out;
                 }
                 /*
                  * Deal with broken servers that hand out two
                  * delegations for the same file.
+                * Allow for upgrades to a WRITE delegation, but
+                * nothing else.
                  */
                 dfprintk(FILE, "%s: server %s handed out "
                                 "a duplicate delegation!\n",
                                 __func__, clp->cl_hostname);
-               if (delegation->type <= old_delegation->type) {
+               if (delegation->type == old_delegation->type ||
+                   !(delegation->type & FMODE_WRITE)) {
                         freeme = delegation;
                         delegation = NULL;
                         goto out;
@@ -455,17 +456,24 @@ static void nfs_client_mark_return_all_delegation_types(struct nfs_client *clp,
         rcu_read_unlock();
  }
  
-static void nfs_client_mark_return_all_delegations(struct nfs_client *clp)
-{
-       nfs_client_mark_return_all_delegation_types(clp, FMODE_READ|FMODE_WRITE);
-}
-
  static void nfs_delegation_run_state_manager(struct nfs_client *clp)
  {
         if (test_bit(NFS4CLNT_DELEGRETURN, &clp->cl_state))
                 nfs4_schedule_state_manager(clp);
  }
  
+void nfs_remove_bad_delegation(struct inode *inode)
+{
+       struct nfs_delegation *delegation;
+
+       delegation = nfs_detach_delegation(NFS_I(inode), NFS_SERVER(inode));
+       if (delegation) {
+               nfs_inode_find_state_and_recover(inode, &delegation->stateid);
+               nfs_free_delegation(delegation);
+       }
+}
+EXPORT_SYMBOL_GPL(nfs_remove_bad_delegation);
+
  /**
   * nfs_expire_all_delegation_types
   * @clp: client to process
@@ -488,18 +496,6 @@ void nfs_expire_all_delegations(struct nfs_client *clp)
         nfs_expire_all_delegation_types(clp, FMODE_READ|FMODE_WRITE);
  }
  
-/**
- * nfs_handle_cb_pathdown - return all delegations after NFS4ERR_CB_PATH_DOWN
- * @clp: client to process
- *
- */
-void nfs_handle_cb_pathdown(struct nfs_client *clp)
-{
-       if (clp == NULL)
-               return;
-       nfs_client_mark_return_all_delegations(clp);
-}
-
  static void nfs_mark_return_unreferenced_delegations(struct nfs_server *server)
  {
         struct nfs_delegation *delegation;
@@ -531,7 +527,7 @@ void nfs_expire_unreferenced_delegations(struct nfs_client *clp)
  /**
   * nfs_async_inode_return_delegation - asynchronously return a delegation
   * @inode: inode to process
- * @stateid: state ID information from CB_RECALL arguments
+ * @stateid: state ID information
   *
   * Returns zero on success, or a negative errno value.
   */
@@ -545,7 +541,7 @@ int nfs_async_inode_return_delegation(struct inode *inode,
         rcu_read_lock();
         delegation = rcu_dereference(NFS_I(inode)->delegation);
  
-       if (!clp->cl_mvops->validate_stateid(delegation, stateid)) {
+       if (!clp->cl_mvops->match_stateid(&delegation->stateid, stateid)) {
                 rcu_read_unlock();
                 return -ENOENT;
         }
@@ -684,21 +680,25 @@ int nfs_delegations_present(struct nfs_client *clp)
   * nfs4_copy_delegation_stateid - Copy inode's state ID information
   * @dst: stateid data structure to fill in
   * @inode: inode to check
+ * @flags: delegation type requirement
   *
- * Returns one and fills in "dst->data" * if inode had a delegation,
- * otherwise zero is returned.
+ * Returns "true" and fills in "dst->data" * if inode had a delegation,
+ * otherwise "false" is returned.
   */
-int nfs4_copy_delegation_stateid(nfs4_stateid *dst, struct inode *inode)
+bool nfs4_copy_delegation_stateid(nfs4_stateid *dst, struct inode *inode,
+               fmode_t flags)
  {
         struct nfs_inode *nfsi = NFS_I(inode);
         struct nfs_delegation *delegation;
-       int ret = 0;
+       bool ret;
  
+       flags &= FMODE_READ|FMODE_WRITE;
         rcu_read_lock();
         delegation = rcu_dereference(nfsi->delegation);
-       if (delegation != NULL) {
-               memcpy(dst->data, delegation->stateid.data, sizeof(dst->data));
-               ret = 1;
+       ret = (delegation != NULL && (delegation->type & flags) == flags);
+       if (ret) {
+               nfs4_stateid_copy(dst, &delegation->stateid);
+               nfs_mark_delegation_referenced(delegation);
         }
         rcu_read_unlock();
         return ret;
diff --git a/fs/nfs/delegation.h b/fs/nfs/delegation.h
index d9322e490c56ff98a39e79295186653e0e80589e..cd6a7a8dadae9054e5bd5557c0226bfd8accc116 100644
--- a/fs/nfs/delegation.h
+++ b/fs/nfs/delegation.h
@@ -42,9 +42,9 @@ void nfs_super_return_all_delegations(struct super_block *sb);
  void nfs_expire_all_delegations(struct nfs_client *clp);
  void nfs_expire_all_delegation_types(struct nfs_client *clp, fmode_t flags);
  void nfs_expire_unreferenced_delegations(struct nfs_client *clp);
-void nfs_handle_cb_pathdown(struct nfs_client *clp);
  int nfs_client_return_marked_delegations(struct nfs_client *clp);
  int nfs_delegations_present(struct nfs_client *clp);
+void nfs_remove_bad_delegation(struct inode *inode);
  
  void nfs_delegation_mark_reclaim(struct nfs_client *clp);
  void nfs_delegation_reap_unclaimed(struct nfs_client *clp);
@@ -53,7 +53,7 @@ void nfs_delegation_reap_unclaimed(struct nfs_client *clp);
  int nfs4_proc_delegreturn(struct inode *inode, struct rpc_cred *cred, const nfs4_stateid *stateid, int issync);
  int nfs4_open_delegation_recall(struct nfs_open_context *ctx, struct nfs4_state *state, const nfs4_stateid *stateid);
  int nfs4_lock_delegation_recall(struct nfs4_state *state, struct file_lock *fl);
-int nfs4_copy_delegation_stateid(nfs4_stateid *dst, struct inode *inode);
+bool nfs4_copy_delegation_stateid(nfs4_stateid *dst, struct inode *inode, fmode_t flags);
  
  void nfs_mark_delegation_referenced(struct nfs_delegation *delegation);
  int nfs_have_delegation(struct inode *inode, fmode_t flags);
diff --git a/fs/nfs/dir.c b/fs/nfs/dir.c
index 32aa6917265a388bafc582cddb5c803d38605a5c..4aaf0316d76a040a1e17e60e00060b28005d59a2 100644
--- a/fs/nfs/dir.c
+++ b/fs/nfs/dir.c
@@ -207,7 +207,7 @@ struct nfs_cache_array_entry {
  };
  
  struct nfs_cache_array {
-       unsigned int size;
+       int size;
         int eof_index;
         u64 last_cookie;
         struct nfs_cache_array_entry array[0];
@@ -1429,6 +1429,7 @@ static struct dentry *nfs_atomic_lookup(struct inode *dir, struct dentry *dentry
         }
  
         open_flags = nd->intent.open.flags;
+       attr.ia_valid = 0;
  
         ctx = create_nfs_open_context(dentry, open_flags);
         res = ERR_CAST(ctx);
@@ -1437,11 +1438,14 @@ static struct dentry *nfs_atomic_lookup(struct inode *dir, struct dentry *dentry
  
         if (nd->flags & LOOKUP_CREATE) {
                 attr.ia_mode = nd->intent.open.create_mode;
-               attr.ia_valid = ATTR_MODE;
+               attr.ia_valid |= ATTR_MODE;
                 attr.ia_mode &= ~current_umask();
-       } else {
+       } else
                 open_flags &= ~(O_EXCL | O_CREAT);
-               attr.ia_valid = 0;
+
+       if (open_flags & O_TRUNC) {
+               attr.ia_valid |= ATTR_SIZE;
+               attr.ia_size = 0;
         }
  
         /* Open the file on the server */
@@ -1495,6 +1499,7 @@ static int nfs_open_revalidate(struct dentry *dentry, struct nameidata *nd)
         struct inode *inode;
         struct inode *dir;
         struct nfs_open_context *ctx;
+       struct iattr attr;
         int openflags, ret = 0;
  
         if (nd->flags & LOOKUP_RCU)
@@ -1523,19 +1528,27 @@ static int nfs_open_revalidate(struct dentry *dentry, struct nameidata *nd)
         /* We cannot do exclusive creation on a positive dentry */
         if ((openflags & (O_CREAT|O_EXCL)) == (O_CREAT|O_EXCL))
                 goto no_open_dput;
-       /* We can't create new files, or truncate existing ones here */
-       openflags &= ~(O_CREAT|O_EXCL|O_TRUNC);
+       /* We can't create new files here */
+       openflags &= ~(O_CREAT|O_EXCL);
  
         ctx = create_nfs_open_context(dentry, openflags);
         ret = PTR_ERR(ctx);
         if (IS_ERR(ctx))
                 goto out;
+
+       attr.ia_valid = 0;
+       if (openflags & O_TRUNC) {
+               attr.ia_valid |= ATTR_SIZE;
+               attr.ia_size = 0;
+               nfs_wb_all(inode);
+       }
+
         /*
          * Note: we're not holding inode->i_mutex and so may be racing with
          * operations that change the directory. We therefore save the
          * change attribute *before* we do the RPC call.
          */
-       inode = NFS_PROTO(dir)->open_context(dir, ctx, openflags, NULL);
+       inode = NFS_PROTO(dir)->open_context(dir, ctx, openflags, &attr);
         if (IS_ERR(inode)) {
                 ret = PTR_ERR(inode);
                 switch (ret) {
diff --git a/fs/nfs/direct.c b/fs/nfs/direct.c
index 1940f1a56a5fe059cac63144a4f17cdbf26ab1da..9c7f66ac6cc2ad2d40f7eec79f931f205b96a310 100644
--- a/fs/nfs/direct.c
+++ b/fs/nfs/direct.c
@@ -265,9 +265,7 @@ static void nfs_direct_read_release(void *calldata)
  }
  
  static const struct rpc_call_ops nfs_read_direct_ops = {
-#if defined(CONFIG_NFS_V4_1)
         .rpc_call_prepare = nfs_read_prepare,
-#endif /* CONFIG_NFS_V4_1 */
         .rpc_call_done = nfs_direct_read_result,
         .rpc_release = nfs_direct_read_release,
  };
@@ -554,9 +552,7 @@ static void nfs_direct_commit_release(void *calldata)
  }
  
  static const struct rpc_call_ops nfs_commit_direct_ops = {
-#if defined(CONFIG_NFS_V4_1)
         .rpc_call_prepare = nfs_write_prepare,
-#endif /* CONFIG_NFS_V4_1 */
         .rpc_call_done = nfs_direct_commit_result,
         .rpc_release = nfs_direct_commit_release,
  };
@@ -696,9 +692,7 @@ out_unlock:
  }
  
  static const struct rpc_call_ops nfs_write_direct_ops = {
-#if defined(CONFIG_NFS_V4_1)
         .rpc_call_prepare = nfs_write_prepare,
-#endif /* CONFIG_NFS_V4_1 */
         .rpc_call_done = nfs_direct_write_result,
         .rpc_release = nfs_direct_write_release,
  };
diff --git a/fs/nfs/dns_resolve.c b/fs/nfs/dns_resolve.c
index a6e711ad130f9fdb456ad2a09b056214815af4f5..b3924b8a600021e27c89fc6c9ff910a514aec9eb 100644
--- a/fs/nfs/dns_resolve.c
+++ b/fs/nfs/dns_resolve.c
@@ -10,8 +10,9 @@
  
  #include <linux/sunrpc/clnt.h>
  #include <linux/dns_resolver.h>
+#include "dns_resolve.h"
  
-ssize_t nfs_dns_resolve_name(char *name, size_t namelen,
+ssize_t nfs_dns_resolve_name(struct net *net, char *name, size_t namelen,
                 struct sockaddr *sa, size_t salen)
  {
         ssize_t ret;
@@ -20,7 +21,7 @@ ssize_t nfs_dns_resolve_name(char *name, size_t namelen,
  
         ip_len = dns_query(NULL, name, namelen, NULL, &ip_addr, NULL);
         if (ip_len > 0)
-               ret = rpc_pton(ip_addr, ip_len, sa, salen);
+               ret = rpc_pton(net, ip_addr, ip_len, sa, salen);
         else
                 ret = -ESRCH;
         kfree(ip_addr);
@@ -40,15 +41,15 @@ ssize_t nfs_dns_resolve_name(char *name, size_t namelen,
  #include <linux/sunrpc/clnt.h>
  #include <linux/sunrpc/cache.h>
  #include <linux/sunrpc/svcauth.h>
+#include <linux/sunrpc/rpc_pipe_fs.h>
  
  #include "dns_resolve.h"
  #include "cache_lib.h"
+#include "netns.h"
  
  #define NFS_DNS_HASHBITS 4
  #define NFS_DNS_HASHTBL_SIZE (1 << NFS_DNS_HASHBITS)
  
-static struct cache_head *nfs_dns_table[NFS_DNS_HASHTBL_SIZE];
-
  struct nfs_dns_ent {
         struct cache_head h;
  
@@ -224,7 +225,7 @@ static int nfs_dns_parse(struct cache_detail *cd, char *buf, int buflen)
         len = qword_get(&buf, buf1, sizeof(buf1));
         if (len <= 0)
                 goto out;
-       key.addrlen = rpc_pton(buf1, len,
+       key.addrlen = rpc_pton(cd->net, buf1, len,
                         (struct sockaddr *)&key.addr,
                         sizeof(key.addr));
  
@@ -259,21 +260,6 @@ out:
         return ret;
  }
  
-static struct cache_detail nfs_dns_resolve = {
-       .owner = THIS_MODULE,
-       .hash_size = NFS_DNS_HASHTBL_SIZE,
-       .hash_table = nfs_dns_table,
-       .name = "dns_resolve",
-       .cache_put = nfs_dns_ent_put,
-       .cache_upcall = nfs_dns_upcall,
-       .cache_parse = nfs_dns_parse,
-       .cache_show = nfs_dns_show,
-       .match = nfs_dns_match,
-       .init = nfs_dns_ent_init,
-       .update = nfs_dns_ent_update,
-       .alloc = nfs_dns_ent_alloc,
-};
-
  static int do_cache_lookup(struct cache_detail *cd,
                 struct nfs_dns_ent *key,
                 struct nfs_dns_ent **item,
@@ -336,8 +322,8 @@ out:
         return ret;
  }
  
-ssize_t nfs_dns_resolve_name(char *name, size_t namelen,
-               struct sockaddr *sa, size_t salen)
+ssize_t nfs_dns_resolve_name(struct net *net, char *name,
+               size_t namelen, struct sockaddr *sa, size_t salen)
  {
         struct nfs_dns_ent key = {
                 .hostname = name,
@@ -345,28 +331,118 @@ ssize_t nfs_dns_resolve_name(char *name, size_t namelen,
         };
         struct nfs_dns_ent *item = NULL;
         ssize_t ret;
+       struct nfs_net *nn = net_generic(net, nfs_net_id);
  
-       ret = do_cache_lookup_wait(&nfs_dns_resolve, &key, &item);
+       ret = do_cache_lookup_wait(nn->nfs_dns_resolve, &key, &item);
         if (ret == 0) {
                 if (salen >= item->addrlen) {
                         memcpy(sa, &item->addr, item->addrlen);
                         ret = item->addrlen;
                 } else
                         ret = -EOVERFLOW;
-               cache_put(&item->h, &nfs_dns_resolve);
+               cache_put(&item->h, nn->nfs_dns_resolve);
         } else if (ret == -ENOENT)
                 ret = -ESRCH;
         return ret;
  }
  
+int nfs_dns_resolver_cache_init(struct net *net)
+{
+       int err = -ENOMEM;
+       struct nfs_net *nn = net_generic(net, nfs_net_id);
+       struct cache_detail *cd;
+       struct cache_head **tbl;
+
+       cd = kzalloc(sizeof(struct cache_detail), GFP_KERNEL);
+       if (cd == NULL)
+               goto err_cd;
+
+       tbl = kzalloc(NFS_DNS_HASHTBL_SIZE * sizeof(struct cache_head *),
+                       GFP_KERNEL);
+       if (tbl == NULL)
+               goto err_tbl;
+
+       cd->owner = THIS_MODULE,
+       cd->hash_size = NFS_DNS_HASHTBL_SIZE,
+       cd->hash_table = tbl,
+       cd->name = "dns_resolve",
+       cd->cache_put = nfs_dns_ent_put,
+       cd->cache_upcall = nfs_dns_upcall,
+       cd->cache_parse = nfs_dns_parse,
+       cd->cache_show = nfs_dns_show,
+       cd->match = nfs_dns_match,
+       cd->init = nfs_dns_ent_init,
+       cd->update = nfs_dns_ent_update,
+       cd->alloc = nfs_dns_ent_alloc,
+
+       nfs_cache_init(cd);
+       err = nfs_cache_register_net(net, cd);
+       if (err)
+               goto err_reg;
+       nn->nfs_dns_resolve = cd;
+       return 0;
+
+err_reg:
+       nfs_cache_destroy(cd);
+       kfree(cd->hash_table);
+err_tbl:
+       kfree(cd);
+err_cd:
+       return err;
+}
+
+void nfs_dns_resolver_cache_destroy(struct net *net)
+{
+       struct nfs_net *nn = net_generic(net, nfs_net_id);
+       struct cache_detail *cd = nn->nfs_dns_resolve;
+
+       nfs_cache_unregister_net(net, cd);
+       nfs_cache_destroy(cd);
+       kfree(cd->hash_table);
+       kfree(cd);
+}
+
+static int rpc_pipefs_event(struct notifier_block *nb, unsigned long event,
+                          void *ptr)
+{
+       struct super_block *sb = ptr;
+       struct net *net = sb->s_fs_info;
+       struct nfs_net *nn = net_generic(net, nfs_net_id);
+       struct cache_detail *cd = nn->nfs_dns_resolve;
+       int ret = 0;
+
+       if (cd == NULL)
+               return 0;
+
+       if (!try_module_get(THIS_MODULE))
+               return 0;
+
+       switch (event) {
+       case RPC_PIPEFS_MOUNT:
+               ret = nfs_cache_register_sb(sb, cd);
+               break;
+       case RPC_PIPEFS_UMOUNT:
+               nfs_cache_unregister_sb(sb, cd);
+               break;
+       default:
+               ret = -ENOTSUPP;
+               break;
+       }
+       module_put(THIS_MODULE);
+       return ret;
+}
+
+static struct notifier_block nfs_dns_resolver_block = {
+       .notifier_call  = rpc_pipefs_event,
+};
+
  int nfs_dns_resolver_init(void)
  {
-       return nfs_cache_register(&nfs_dns_resolve);
+       return rpc_pipefs_notifier_register(&nfs_dns_resolver_block);
  }
  
  void nfs_dns_resolver_destroy(void)
  {
-       nfs_cache_unregister(&nfs_dns_resolve);
+       rpc_pipefs_notifier_unregister(&nfs_dns_resolver_block);
  }
-
  #endif
diff --git a/fs/nfs/dns_resolve.h b/fs/nfs/dns_resolve.h
index 199bb5543a91ad3dfc1e03e2f85894f78661e4cb..2e4f596d2923d5876b685fb9fb8bdc0ed13c2de6 100644
--- a/fs/nfs/dns_resolve.h
+++ b/fs/nfs/dns_resolve.h
@@ -15,12 +15,22 @@ static inline int nfs_dns_resolver_init(void)
  
  static inline void nfs_dns_resolver_destroy(void)
  {}
+
+static inline int nfs_dns_resolver_cache_init(struct net *net)
+{
+       return 0;
+}
+
+static inline void nfs_dns_resolver_cache_destroy(struct net *net)
+{}
  #else
  extern int nfs_dns_resolver_init(void);
  extern void nfs_dns_resolver_destroy(void);
+extern int nfs_dns_resolver_cache_init(struct net *net);
+extern void nfs_dns_resolver_cache_destroy(struct net *net);
  #endif
  
-extern ssize_t nfs_dns_resolve_name(char *name, size_t namelen,
-               struct sockaddr *sa, size_t salen);
+extern ssize_t nfs_dns_resolve_name(struct net *net, char *name,
+               size_t namelen, struct sockaddr *sa, size_t salen);
  
  #endif
diff --git a/fs/nfs/file.c b/fs/nfs/file.c
index c43a452f7da2e70c084bddb7dfe6415d194bdf3e..4fdaaa63cf1c3f0d72ca58088ba312026881aa19 100644
--- a/fs/nfs/file.c
+++ b/fs/nfs/file.c
@@ -530,6 +530,8 @@ static int nfs_vm_page_mkwrite(struct vm_area_struct *vma, struct vm_fault *vmf)
         if (mapping != dentry->d_inode->i_mapping)
                 goto out_unlock;
  
+       wait_on_page_writeback(page);
+
         pagelen = nfs_page_length(page);
         if (pagelen == 0)
                 goto out_unlock;
diff --git a/fs/nfs/fscache.c b/fs/nfs/fscache.c
index 419119c371bf81d3a5487448a289cf5e3e219314..ae65c16b3670ebb5ed6da1d214f2cde16fd7f73e 100644
--- a/fs/nfs/fscache.c
+++ b/fs/nfs/fscache.c
@@ -327,7 +327,7 @@ void nfs_fscache_reset_inode_cookie(struct inode *inode)
  {
         struct nfs_inode *nfsi = NFS_I(inode);
         struct nfs_server *nfss = NFS_SERVER(inode);
-       struct fscache_cookie *old = nfsi->fscache;
+       NFS_IFDEBUG(struct fscache_cookie *old = nfsi->fscache);
  
         nfs_fscache_inode_lock(inode);
         if (nfsi->fscache) {
diff --git a/fs/nfs/idmap.c b/fs/nfs/idmap.c
index a1bbf7780dfcec3e1a1dd0c608d3e88842db8820..b7f348bb618b8d8864f7a5e2569aff8f4ef4e3a7 100644
--- a/fs/nfs/idmap.c
+++ b/fs/nfs/idmap.c
@@ -34,11 +34,29 @@
   *  SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
   */
  #include <linux/types.h>
-#include <linux/string.h>
-#include <linux/kernel.h>
-#include <linux/slab.h>
+#include <linux/parser.h>
+#include <linux/fs.h>
  #include <linux/nfs_idmap.h>
+#include <net/net_namespace.h>
+#include <linux/sunrpc/rpc_pipe_fs.h>
  #include <linux/nfs_fs.h>
+#include <linux/nfs_fs_sb.h>
+#include <linux/key.h>
+#include <linux/keyctl.h>
+#include <linux/key-type.h>
+#include <keys/user-type.h>
+#include <linux/module.h>
+
+#include "internal.h"
+#include "netns.h"
+
+#define NFS_UINT_MAXLEN 11
+
+/* Default cache timeout is 10 minutes */
+unsigned int nfs_idmap_cache_timeout = 600;
+static const struct cred *id_resolver_cache;
+static struct key_type key_type_id_resolver_legacy;
+
  
  /**
   * nfs_fattr_init_names - initialise the nfs_fattr owner_name/group_name fields
@@ -142,24 +160,7 @@ static int nfs_map_numeric_to_string(__u32 id, char *buf, size_t buflen)
         return snprintf(buf, buflen, "%u", id);
  }
  
-#ifdef CONFIG_NFS_USE_NEW_IDMAPPER
-
-#include <linux/cred.h>
-#include <linux/sunrpc/sched.h>
-#include <linux/nfs4.h>
-#include <linux/nfs_fs_sb.h>
-#include <linux/keyctl.h>
-#include <linux/key-type.h>
-#include <linux/rcupdate.h>
-#include <linux/err.h>
-
-#include <keys/user-type.h>
-
-#define NFS_UINT_MAXLEN 11
-
-const struct cred *id_resolver_cache;
-
-struct key_type key_type_id_resolver = {
+static struct key_type key_type_id_resolver = {
         .name           = "id_resolver",
         .instantiate    = user_instantiate,
         .match          = user_match,
@@ -169,13 +170,14 @@ struct key_type key_type_id_resolver = {
         .read           = user_read,
  };
  
-int nfs_idmap_init(void)
+static int nfs_idmap_init_keyring(void)
  {
         struct cred *cred;
         struct key *keyring;
         int ret = 0;
  
-       printk(KERN_NOTICE "Registering the %s key type\n", key_type_id_resolver.name);
+       printk(KERN_NOTICE "NFS: Registering the %s key type\n",
+               key_type_id_resolver.name);
  
         cred = prepare_kernel_cred(NULL);
         if (!cred)
@@ -211,7 +213,7 @@ failed_put_cred:
         return ret;
  }
  
-void nfs_idmap_quit(void)
+static void nfs_idmap_quit_keyring(void)
  {
         key_revoke(id_resolver_cache->thread_keyring);
         unregister_key_type(&key_type_id_resolver);
@@ -246,8 +248,10 @@ static ssize_t nfs_idmap_get_desc(const char *name, size_t namelen,
         return desclen;
  }
  
-static ssize_t nfs_idmap_request_key(const char *name, size_t namelen,
-               const char *type, void *data, size_t data_size)
+static ssize_t nfs_idmap_request_key(struct key_type *key_type,
+                                    const char *name, size_t namelen,
+                                    const char *type, void *data,
+                                    size_t data_size, struct idmap *idmap)
  {
         const struct cred *saved_cred;
         struct key *rkey;
@@ -260,8 +264,12 @@ static ssize_t nfs_idmap_request_key(const char *name, size_t namelen,
                 goto out;
  
         saved_cred = override_creds(id_resolver_cache);
-       rkey = request_key(&key_type_id_resolver, desc, "");
+       if (idmap)
+               rkey = request_key_with_auxdata(key_type, desc, "", 0, idmap);
+       else
+               rkey = request_key(&key_type_id_resolver, desc, "");
         revert_creds(saved_cred);
+
         kfree(desc);
         if (IS_ERR(rkey)) {
                 ret = PTR_ERR(rkey);
@@ -294,31 +302,46 @@ out:
         return ret;
  }
  
+static ssize_t nfs_idmap_get_key(const char *name, size_t namelen,
+                                const char *type, void *data,
+                                size_t data_size, struct idmap *idmap)
+{
+       ssize_t ret = nfs_idmap_request_key(&key_type_id_resolver,
+                                           name, namelen, type, data,
+                                           data_size, NULL);
+       if (ret < 0) {
+               ret = nfs_idmap_request_key(&key_type_id_resolver_legacy,
+                                           name, namelen, type, data,
+                                           data_size, idmap);
+       }
+       return ret;
+}
  
  /* ID -> Name */
-static ssize_t nfs_idmap_lookup_name(__u32 id, const char *type, char *buf, size_t buflen)
+static ssize_t nfs_idmap_lookup_name(__u32 id, const char *type, char *buf,
+                                    size_t buflen, struct idmap *idmap)
  {
         char id_str[NFS_UINT_MAXLEN];
         int id_len;
         ssize_t ret;
  
         id_len = snprintf(id_str, sizeof(id_str), "%u", id);
-       ret = nfs_idmap_request_key(id_str, id_len, type, buf, buflen);
+       ret = nfs_idmap_get_key(id_str, id_len, type, buf, buflen, idmap);
         if (ret < 0)
                 return -EINVAL;
         return ret;
  }
  
  /* Name -> ID */
-static int nfs_idmap_lookup_id(const char *name, size_t namelen,
-                               const char *type, __u32 *id)
+static int nfs_idmap_lookup_id(const char *name, size_t namelen, const char *type,
+                              __u32 *id, struct idmap *idmap)
  {
         char id_str[NFS_UINT_MAXLEN];
         long id_long;
         ssize_t data_size;
         int ret = 0;
  
-       data_size = nfs_idmap_request_key(name, namelen, type, id_str, NFS_UINT_MAXLEN);
+       data_size = nfs_idmap_get_key(name, namelen, type, id_str, NFS_UINT_MAXLEN, idmap);
         if (data_size <= 0) {
                 ret = -EINVAL;
         } else {
@@ -328,114 +351,103 @@ static int nfs_idmap_lookup_id(const char *name, size_t namelen,
         return ret;
  }
  
-int nfs_map_name_to_uid(const struct nfs_server *server, const char *name, size_t namelen, __u32 *uid)
-{
-       if (nfs_map_string_to_numeric(name, namelen, uid))
-               return 0;
-       return nfs_idmap_lookup_id(name, namelen, "uid", uid);
-}
-
-int nfs_map_group_to_gid(const struct nfs_server *server, const char *name, size_t namelen, __u32 *gid)
-{
-       if (nfs_map_string_to_numeric(name, namelen, gid))
-               return 0;
-       return nfs_idmap_lookup_id(name, namelen, "gid", gid);
-}
-
-int nfs_map_uid_to_name(const struct nfs_server *server, __u32 uid, char *buf, size_t buflen)
-{
-       int ret = -EINVAL;
-
-       if (!(server->caps & NFS_CAP_UIDGID_NOMAP))
-               ret = nfs_idmap_lookup_name(uid, "user", buf, buflen);
-       if (ret < 0)
-               ret = nfs_map_numeric_to_string(uid, buf, buflen);
-       return ret;
-}
-int nfs_map_gid_to_group(const struct nfs_server *server, __u32 gid, char *buf, size_t buflen)
-{
-       int ret = -EINVAL;
+/* idmap classic begins here */
+module_param(nfs_idmap_cache_timeout, int, 0644);
  
-       if (!(server->caps & NFS_CAP_UIDGID_NOMAP))
-               ret = nfs_idmap_lookup_name(gid, "group", buf, buflen);
-       if (ret < 0)
-               ret = nfs_map_numeric_to_string(gid, buf, buflen);
-       return ret;
-}
-
-#else  /* CONFIG_NFS_USE_NEW_IDMAPPER not defined */
-
-#include <linux/module.h>
-#include <linux/mutex.h>
-#include <linux/init.h>
-#include <linux/socket.h>
-#include <linux/in.h>
-#include <linux/sched.h>
-#include <linux/sunrpc/clnt.h>
-#include <linux/workqueue.h>
-#include <linux/sunrpc/rpc_pipe_fs.h>
-
-#include <linux/nfs_fs.h>
-
-#include "nfs4_fs.h"
-
-#define IDMAP_HASH_SZ          128
-
-/* Default cache timeout is 10 minutes */
-unsigned int nfs_idmap_cache_timeout = 600 * HZ;
-
-static int param_set_idmap_timeout(const char *val, struct kernel_param *kp)
-{
-       char *endp;
-       int num = simple_strtol(val, &endp, 0);
-       int jif = num * HZ;
-       if (endp == val || *endp || num < 0 || jif < num)
-               return -EINVAL;
-       *((int *)kp->arg) = jif;
-       return 0;
-}
-
-module_param_call(idmap_cache_timeout, param_set_idmap_timeout, param_get_int,
-                &nfs_idmap_cache_timeout, 0644);
-
-struct idmap_hashent {
-       unsigned long           ih_expires;
-       __u32                   ih_id;
-       size_t                  ih_namelen;
-       char                    ih_name[IDMAP_NAMESZ];
+struct idmap {
+       struct rpc_pipe         *idmap_pipe;
+       struct key_construction *idmap_key_cons;
  };
  
-struct idmap_hashtable {
-       __u8                    h_type;
-       struct idmap_hashent    h_entries[IDMAP_HASH_SZ];
+enum {
+       Opt_find_uid, Opt_find_gid, Opt_find_user, Opt_find_group, Opt_find_err
  };
  
-struct idmap {
-       struct dentry           *idmap_dentry;
-       wait_queue_head_t       idmap_wq;
-       struct idmap_msg        idmap_im;
-       struct mutex            idmap_lock;     /* Serializes upcalls */
-       struct mutex            idmap_im_lock;  /* Protects the hashtable */
-       struct idmap_hashtable  idmap_user_hash;
-       struct idmap_hashtable  idmap_group_hash;
+static const match_table_t nfs_idmap_tokens = {
+       { Opt_find_uid, "uid:%s" },
+       { Opt_find_gid, "gid:%s" },
+       { Opt_find_user, "user:%s" },
+       { Opt_find_group, "group:%s" },
+       { Opt_find_err, NULL }
  };
  
+static int nfs_idmap_legacy_upcall(struct key_construction *, const char *, void *);
  static ssize_t idmap_pipe_downcall(struct file *, const char __user *,
                                    size_t);
  static void idmap_pipe_destroy_msg(struct rpc_pipe_msg *);
  
-static unsigned int fnvhash32(const void *, size_t);
-
  static const struct rpc_pipe_ops idmap_upcall_ops = {
         .upcall         = rpc_pipe_generic_upcall,
         .downcall       = idmap_pipe_downcall,
         .destroy_msg    = idmap_pipe_destroy_msg,
  };
  
+static struct key_type key_type_id_resolver_legacy = {
+       .name           = "id_resolver",
+       .instantiate    = user_instantiate,
+       .match          = user_match,
+       .revoke         = user_revoke,
+       .destroy        = user_destroy,
+       .describe       = user_describe,
+       .read           = user_read,
+       .request_key    = nfs_idmap_legacy_upcall,
+};
+
+static void __nfs_idmap_unregister(struct rpc_pipe *pipe)
+{
+       if (pipe->dentry)
+               rpc_unlink(pipe->dentry);
+}
+
+static int __nfs_idmap_register(struct dentry *dir,
+                                    struct idmap *idmap,
+                                    struct rpc_pipe *pipe)
+{
+       struct dentry *dentry;
+
+       dentry = rpc_mkpipe_dentry(dir, "idmap", idmap, pipe);
+       if (IS_ERR(dentry))
+               return PTR_ERR(dentry);
+       pipe->dentry = dentry;
+       return 0;
+}
+
+static void nfs_idmap_unregister(struct nfs_client *clp,
+                                     struct rpc_pipe *pipe)
+{
+       struct net *net = clp->net;
+       struct super_block *pipefs_sb;
+
+       pipefs_sb = rpc_get_sb_net(net);
+       if (pipefs_sb) {
+               __nfs_idmap_unregister(pipe);
+               rpc_put_sb_net(net);
+       }
+}
+
+static int nfs_idmap_register(struct nfs_client *clp,
+                                  struct idmap *idmap,
+                                  struct rpc_pipe *pipe)
+{
+       struct net *net = clp->net;
+       struct super_block *pipefs_sb;
+       int err = 0;
+
+       pipefs_sb = rpc_get_sb_net(net);
+       if (pipefs_sb) {
+               if (clp->cl_rpcclient->cl_dentry)
+                       err = __nfs_idmap_register(clp->cl_rpcclient->cl_dentry,
+                                                  idmap, pipe);
+               rpc_put_sb_net(net);
+       }
+       return err;
+}
+
  int
  nfs_idmap_new(struct nfs_client *clp)
  {
         struct idmap *idmap;
+       struct rpc_pipe *pipe;
         int error;
  
         BUG_ON(clp->cl_idmap != NULL);
@@ -444,19 +456,19 @@ nfs_idmap_new(struct nfs_client *clp)
         if (idmap == NULL)
                 return -ENOMEM;
  
-       idmap->idmap_dentry = rpc_mkpipe(clp->cl_rpcclient->cl_path.dentry,
-                       "idmap", idmap, &idmap_upcall_ops, 0);
-       if (IS_ERR(idmap->idmap_dentry)) {
-               error = PTR_ERR(idmap->idmap_dentry);
+       pipe = rpc_mkpipe_data(&idmap_upcall_ops, 0);
+       if (IS_ERR(pipe)) {
+               error = PTR_ERR(pipe);
                 kfree(idmap);
                 return error;
         }
-
-       mutex_init(&idmap->idmap_lock);
-       mutex_init(&idmap->idmap_im_lock);
-       init_waitqueue_head(&idmap->idmap_wq);
-       idmap->idmap_user_hash.h_type = IDMAP_TYPE_USER;
-       idmap->idmap_group_hash.h_type = IDMAP_TYPE_GROUP;
+       error = nfs_idmap_register(clp, idmap, pipe);
+       if (error) {
+               rpc_destroy_pipe_data(pipe);
+               kfree(idmap);
+               return error;
+       }
+       idmap->idmap_pipe = pipe;
  
         clp->cl_idmap = idmap;
         return 0;
@@ -469,211 +481,220 @@ nfs_idmap_delete(struct nfs_client *clp)
  
         if (!idmap)
                 return;
-       rpc_unlink(idmap->idmap_dentry);
+       nfs_idmap_unregister(clp, idmap->idmap_pipe);
+       rpc_destroy_pipe_data(idmap->idmap_pipe);
         clp->cl_idmap = NULL;
         kfree(idmap);
  }
  
-/*
- * Helper routines for manipulating the hashtable
- */
-static inline struct idmap_hashent *
-idmap_name_hash(struct idmap_hashtable* h, const char *name, size_t len)
-{
-       return &h->h_entries[fnvhash32(name, len) % IDMAP_HASH_SZ];
-}
-
-static struct idmap_hashent *
-idmap_lookup_name(struct idmap_hashtable *h, const char *name, size_t len)
+static int __rpc_pipefs_event(struct nfs_client *clp, unsigned long event,
+                             struct super_block *sb)
  {
-       struct idmap_hashent *he = idmap_name_hash(h, name, len);
+       int err = 0;
  
-       if (he->ih_namelen != len || memcmp(he->ih_name, name, len) != 0)
-               return NULL;
-       if (time_after(jiffies, he->ih_expires))
-               return NULL;
-       return he;
+       switch (event) {
+       case RPC_PIPEFS_MOUNT:
+               BUG_ON(clp->cl_rpcclient->cl_dentry == NULL);
+               err = __nfs_idmap_register(clp->cl_rpcclient->cl_dentry,
+                                               clp->cl_idmap,
+                                               clp->cl_idmap->idmap_pipe);
+               break;
+       case RPC_PIPEFS_UMOUNT:
+               if (clp->cl_idmap->idmap_pipe) {
+                       struct dentry *parent;
+
+                       parent = clp->cl_idmap->idmap_pipe->dentry->d_parent;
+                       __nfs_idmap_unregister(clp->cl_idmap->idmap_pipe);
+                       /*
+                        * Note: This is a dirty hack. SUNRPC hook has been
+                        * called already but simple_rmdir() call for the
+                        * directory returned with error because of idmap pipe
+                        * inside. Thus now we have to remove this directory
+                        * here.
+                        */
+                       if (rpc_rmdir(parent))
+                               printk(KERN_ERR "NFS: %s: failed to remove "
+                                       "clnt dir!\n", __func__);
+               }
+               break;
+       default:
+               printk(KERN_ERR "NFS: %s: unknown event: %lu\n", __func__,
+                       event);
+               return -ENOTSUPP;
+       }
+       return err;
+}
+
+static struct nfs_client *nfs_get_client_for_event(struct net *net, int event)
+{
+       struct nfs_net *nn = net_generic(net, nfs_net_id);
+       struct dentry *cl_dentry;
+       struct nfs_client *clp;
+
+       spin_lock(&nn->nfs_client_lock);
+       list_for_each_entry(clp, &nn->nfs_client_list, cl_share_link) {
+               if (clp->rpc_ops != &nfs_v4_clientops)
+                       continue;
+               cl_dentry = clp->cl_idmap->idmap_pipe->dentry;
+               if (((event == RPC_PIPEFS_MOUNT) && cl_dentry) ||
+                   ((event == RPC_PIPEFS_UMOUNT) && !cl_dentry))
+                       continue;
+               atomic_inc(&clp->cl_count);
+               spin_unlock(&nn->nfs_client_lock);
+               return clp;
+       }
+       spin_unlock(&nn->nfs_client_lock);
+       return NULL;
  }
  
-static inline struct idmap_hashent *
-idmap_id_hash(struct idmap_hashtable* h, __u32 id)
+static int rpc_pipefs_event(struct notifier_block *nb, unsigned long event,
+                           void *ptr)
  {
-       return &h->h_entries[fnvhash32(&id, sizeof(id)) % IDMAP_HASH_SZ];
-}
+       struct super_block *sb = ptr;
+       struct nfs_client *clp;
+       int error = 0;
  
-static struct idmap_hashent *
-idmap_lookup_id(struct idmap_hashtable *h, __u32 id)
-{
-       struct idmap_hashent *he = idmap_id_hash(h, id);
-       if (he->ih_id != id || he->ih_namelen == 0)
-               return NULL;
-       if (time_after(jiffies, he->ih_expires))
-               return NULL;
-       return he;
+       while ((clp = nfs_get_client_for_event(sb->s_fs_info, event))) {
+               error = __rpc_pipefs_event(clp, event, sb);
+               nfs_put_client(clp);
+               if (error)
+                       break;
+       }
+       return error;
  }
  
-/*
- * Routines for allocating new entries in the hashtable.
- * For now, we just have 1 entry per bucket, so it's all
- * pretty trivial.
- */
-static inline struct idmap_hashent *
-idmap_alloc_name(struct idmap_hashtable *h, char *name, size_t len)
-{
-       return idmap_name_hash(h, name, len);
-}
+#define PIPEFS_NFS_PRIO                1
+
+static struct notifier_block nfs_idmap_block = {
+       .notifier_call  = rpc_pipefs_event,
+       .priority       = SUNRPC_PIPEFS_NFS_PRIO,
+};
  
-static inline struct idmap_hashent *
-idmap_alloc_id(struct idmap_hashtable *h, __u32 id)
+int nfs_idmap_init(void)
  {
-       return idmap_id_hash(h, id);
+       int ret;
+       ret = nfs_idmap_init_keyring();
+       if (ret != 0)
+               goto out;
+       ret = rpc_pipefs_notifier_register(&nfs_idmap_block);
+       if (ret != 0)
+               nfs_idmap_quit_keyring();
+out:
+       return ret;
  }
  
-static void
-idmap_update_entry(struct idmap_hashent *he, const char *name,
-               size_t namelen, __u32 id)
+void nfs_idmap_quit(void)
  {
-       he->ih_id = id;
-       memcpy(he->ih_name, name, namelen);
-       he->ih_name[namelen] = '\0';
-       he->ih_namelen = namelen;
-       he->ih_expires = jiffies + nfs_idmap_cache_timeout;
+       rpc_pipefs_notifier_unregister(&nfs_idmap_block);
+       nfs_idmap_quit_keyring();
  }
  
-/*
- * Name -> ID
- */
-static int
-nfs_idmap_id(struct idmap *idmap, struct idmap_hashtable *h,
-               const char *name, size_t namelen, __u32 *id)
+static int nfs_idmap_prepare_message(char *desc, struct idmap_msg *im,
+                                    struct rpc_pipe_msg *msg)
  {
-       struct rpc_pipe_msg msg;
-       struct idmap_msg *im;
-       struct idmap_hashent *he;
-       DECLARE_WAITQUEUE(wq, current);
-       int ret = -EIO;
-
-       im = &idmap->idmap_im;
-
-       /*
-        * String sanity checks
-        * Note that the userland daemon expects NUL terminated strings
-        */
-       for (;;) {
-               if (namelen == 0)
-                       return -EINVAL;
-               if (name[namelen-1] != '\0')
-                       break;
-               namelen--;
-       }
-       if (namelen >= IDMAP_NAMESZ)
-               return -EINVAL;
+       substring_t substr;
+       int token, ret;
  
-       mutex_lock(&idmap->idmap_lock);
-       mutex_lock(&idmap->idmap_im_lock);
-
-       he = idmap_lookup_name(h, name, namelen);
-       if (he != NULL) {
-               *id = he->ih_id;
-               ret = 0;
-               goto out;
-       }
+       memset(im,  0, sizeof(*im));
+       memset(msg, 0, sizeof(*msg));
  
-       memset(im, 0, sizeof(*im));
-       memcpy(im->im_name, name, namelen);
+       im->im_type = IDMAP_TYPE_GROUP;
+       token = match_token(desc, nfs_idmap_tokens, &substr);
  
-       im->im_type = h->h_type;
-       im->im_conv = IDMAP_CONV_NAMETOID;
+       switch (token) {
+       case Opt_find_uid:
+               im->im_type = IDMAP_TYPE_USER;
+       case Opt_find_gid:
+               im->im_conv = IDMAP_CONV_NAMETOID;
+               ret = match_strlcpy(im->im_name, &substr, IDMAP_NAMESZ);
+               break;
  
-       memset(&msg, 0, sizeof(msg));
-       msg.data = im;
-       msg.len = sizeof(*im);
+       case Opt_find_user:
+               im->im_type = IDMAP_TYPE_USER;
+       case Opt_find_group:
+               im->im_conv = IDMAP_CONV_IDTONAME;
+               ret = match_int(&substr, &im->im_id);
+               break;
  
-       add_wait_queue(&idmap->idmap_wq, &wq);
-       if (rpc_queue_upcall(idmap->idmap_dentry->d_inode, &msg) < 0) {
-               remove_wait_queue(&idmap->idmap_wq, &wq);
+       default:
+               ret = -EINVAL;
                 goto out;
         }
  
-       set_current_state(TASK_UNINTERRUPTIBLE);
-       mutex_unlock(&idmap->idmap_im_lock);
-       schedule();
-       __set_current_state(TASK_RUNNING);
-       remove_wait_queue(&idmap->idmap_wq, &wq);
-       mutex_lock(&idmap->idmap_im_lock);
+       msg->data = im;
+       msg->len  = sizeof(struct idmap_msg);
  
-       if (im->im_status & IDMAP_STATUS_SUCCESS) {
-               *id = im->im_id;
-               ret = 0;
-       }
-
- out:
-       memset(im, 0, sizeof(*im));
-       mutex_unlock(&idmap->idmap_im_lock);
-       mutex_unlock(&idmap->idmap_lock);
+out:
         return ret;
  }
  
-/*
- * ID -> Name
- */
-static int
-nfs_idmap_name(struct idmap *idmap, struct idmap_hashtable *h,
-               __u32 id, char *name)
+static int nfs_idmap_legacy_upcall(struct key_construction *cons,
+                                  const char *op,
+                                  void *aux)
  {
-       struct rpc_pipe_msg msg;
+       struct rpc_pipe_msg *msg;
         struct idmap_msg *im;
-       struct idmap_hashent *he;
-       DECLARE_WAITQUEUE(wq, current);
-       int ret = -EIO;
-       unsigned int len;
-
-       im = &idmap->idmap_im;
+       struct idmap *idmap = (struct idmap *)aux;
+       struct key *key = cons->key;
+       int ret;
  
-       mutex_lock(&idmap->idmap_lock);
-       mutex_lock(&idmap->idmap_im_lock);
+       /* msg and im are freed in idmap_pipe_destroy_msg */
+       msg = kmalloc(sizeof(*msg), GFP_KERNEL);
+       if (!msg) {
+               ret = -ENOMEM;
+               goto out0;
+       }
  
-       he = idmap_lookup_id(h, id);
-       if (he) {
-               memcpy(name, he->ih_name, he->ih_namelen);
-               ret = he->ih_namelen;
-               goto out;
+       im = kmalloc(sizeof(*im), GFP_KERNEL);
+       if (!im) {
+               ret = -ENOMEM;
+               goto out1;
         }
  
-       memset(im, 0, sizeof(*im));
-       im->im_type = h->h_type;
-       im->im_conv = IDMAP_CONV_IDTONAME;
-       im->im_id = id;
+       ret = nfs_idmap_prepare_message(key->description, im, msg);
+       if (ret < 0)
+               goto out2;
  
-       memset(&msg, 0, sizeof(msg));
-       msg.data = im;
-       msg.len = sizeof(*im);
+       idmap->idmap_key_cons = cons;
  
-       add_wait_queue(&idmap->idmap_wq, &wq);
+       ret = rpc_queue_upcall(idmap->idmap_pipe, msg);
+       if (ret < 0)
+               goto out2;
  
-       if (rpc_queue_upcall(idmap->idmap_dentry->d_inode, &msg) < 0) {
-               remove_wait_queue(&idmap->idmap_wq, &wq);
-               goto out;
-       }
+       return ret;
+
+out2:
+       kfree(im);
+out1:
+       kfree(msg);
+out0:
+       key_revoke(cons->key);
+       key_revoke(cons->authkey);
+       return ret;
+}
+
+static int nfs_idmap_instantiate(struct key *key, struct key *authkey, char *data)
+{
+       return key_instantiate_and_link(key, data, strlen(data) + 1,
+                                       id_resolver_cache->thread_keyring,
+                                       authkey);
+}
  
-       set_current_state(TASK_UNINTERRUPTIBLE);
-       mutex_unlock(&idmap->idmap_im_lock);
-       schedule();
-       __set_current_state(TASK_RUNNING);
-       remove_wait_queue(&idmap->idmap_wq, &wq);
-       mutex_lock(&idmap->idmap_im_lock);
-
-       if (im->im_status & IDMAP_STATUS_SUCCESS) {
-               if ((len = strnlen(im->im_name, IDMAP_NAMESZ)) == 0)
-                       goto out;
-               memcpy(name, im->im_name, len);
-               ret = len;
+static int nfs_idmap_read_message(struct idmap_msg *im, struct key *key, struct key *authkey)
+{
+       char id_str[NFS_UINT_MAXLEN];
+       int ret = -EINVAL;
+
+       switch (im->im_conv) {
+       case IDMAP_CONV_NAMETOID:
+               sprintf(id_str, "%d", im->im_id);
+               ret = nfs_idmap_instantiate(key, authkey, id_str);
+               break;
+       case IDMAP_CONV_IDTONAME:
+               ret = nfs_idmap_instantiate(key, authkey, im->im_name);
+               break;
         }
  
- out:
-       memset(im, 0, sizeof(*im));
-       mutex_unlock(&idmap->idmap_im_lock);
-       mutex_unlock(&idmap->idmap_lock);
         return ret;
  }
  
@@ -682,115 +703,51 @@ idmap_pipe_downcall(struct file *filp, const char __user *src, size_t mlen)
  {
         struct rpc_inode *rpci = RPC_I(filp->f_path.dentry->d_inode);
         struct idmap *idmap = (struct idmap *)rpci->private;
-       struct idmap_msg im_in, *im = &idmap->idmap_im;
-       struct idmap_hashtable *h;
-       struct idmap_hashent *he = NULL;
+       struct key_construction *cons = idmap->idmap_key_cons;
+       struct idmap_msg im;
         size_t namelen_in;
         int ret;
  
-       if (mlen != sizeof(im_in))
-               return -ENOSPC;
-
-       if (copy_from_user(&im_in, src, mlen) != 0)
-               return -EFAULT;
-
-       mutex_lock(&idmap->idmap_im_lock);
-
-       ret = mlen;
-       im->im_status = im_in.im_status;
-       /* If we got an error, terminate now, and wake up pending upcalls */
-       if (!(im_in.im_status & IDMAP_STATUS_SUCCESS)) {
-               wake_up(&idmap->idmap_wq);
+       if (mlen != sizeof(im)) {
+               ret = -ENOSPC;
                 goto out;
         }
  
-       /* Sanity checking of strings */
-       ret = -EINVAL;
-       namelen_in = strnlen(im_in.im_name, IDMAP_NAMESZ);
-       if (namelen_in == 0 || namelen_in == IDMAP_NAMESZ)
+       if (copy_from_user(&im, src, mlen) != 0) {
+               ret = -EFAULT;
                 goto out;
+       }
  
-       switch (im_in.im_type) {
-               case IDMAP_TYPE_USER:
-                       h = &idmap->idmap_user_hash;
-                       break;
-               case IDMAP_TYPE_GROUP:
-                       h = &idmap->idmap_group_hash;
-                       break;
-               default:
-                       goto out;
+       if (!(im.im_status & IDMAP_STATUS_SUCCESS)) {
+               ret = mlen;
+               complete_request_key(idmap->idmap_key_cons, -ENOKEY);
+               goto out_incomplete;
         }
  
-       switch (im_in.im_conv) {
-       case IDMAP_CONV_IDTONAME:
-               /* Did we match the current upcall? */
-               if (im->im_conv == IDMAP_CONV_IDTONAME
-                               && im->im_type == im_in.im_type
-                               && im->im_id == im_in.im_id) {
-                       /* Yes: copy string, including the terminating '\0'  */
-                       memcpy(im->im_name, im_in.im_name, namelen_in);
-                       im->im_name[namelen_in] = '\0';
-                       wake_up(&idmap->idmap_wq);
-               }
-               he = idmap_alloc_id(h, im_in.im_id);
-               break;
-       case IDMAP_CONV_NAMETOID:
-               /* Did we match the current upcall? */
-               if (im->im_conv == IDMAP_CONV_NAMETOID
-                               && im->im_type == im_in.im_type
-                               && strnlen(im->im_name, IDMAP_NAMESZ) == namelen_in
-                               && memcmp(im->im_name, im_in.im_name, namelen_in) == 0) {
-                       im->im_id = im_in.im_id;
-                       wake_up(&idmap->idmap_wq);
-               }
-               he = idmap_alloc_name(h, im_in.im_name, namelen_in);
-               break;
-       default:
+       namelen_in = strnlen(im.im_name, IDMAP_NAMESZ);
+       if (namelen_in == 0 || namelen_in == IDMAP_NAMESZ) {
+               ret = -EINVAL;
                 goto out;
         }
  
-       /* If the entry is valid, also copy it to the cache */
-       if (he != NULL)
-               idmap_update_entry(he, im_in.im_name, namelen_in, im_in.im_id);
-       ret = mlen;
+       ret = nfs_idmap_read_message(&im, cons->key, cons->authkey);
+       if (ret >= 0) {
+               key_set_timeout(cons->key, nfs_idmap_cache_timeout);
+               ret = mlen;
+       }
+
  out:
-       mutex_unlock(&idmap->idmap_im_lock);
+       complete_request_key(idmap->idmap_key_cons, ret);
+out_incomplete:
         return ret;
  }
  
  static void
  idmap_pipe_destroy_msg(struct rpc_pipe_msg *msg)
  {
-       struct idmap_msg *im = msg->data;
-       struct idmap *idmap = container_of(im, struct idmap, idmap_im); 
-
-       if (msg->errno >= 0)
-               return;
-       mutex_lock(&idmap->idmap_im_lock);
-       im->im_status = IDMAP_STATUS_LOOKUPFAIL;
-       wake_up(&idmap->idmap_wq);
-       mutex_unlock(&idmap->idmap_im_lock);
-}
-
-/* 
- * Fowler/Noll/Vo hash
- *    http://www.isthe.com/chongo/tech/comp/fnv/
- */
-
-#define FNV_P_32 ((unsigned int)0x01000193) /* 16777619 */
-#define FNV_1_32 ((unsigned int)0x811c9dc5) /* 2166136261 */
-
-static unsigned int fnvhash32(const void *buf, size_t buflen)
-{
-       const unsigned char *p, *end = (const unsigned char *)buf + buflen;
-       unsigned int hash = FNV_1_32;
-
-       for (p = buf; p < end; p++) {
-               hash *= FNV_P_32;
-               hash ^= (unsigned int)*p;
-       }
-
-       return hash;
+       /* Free memory allocated in nfs_idmap_legacy_upcall() */
+       kfree(msg->data);
+       kfree(msg);
  }
  
  int nfs_map_name_to_uid(const struct nfs_server *server, const char *name, size_t namelen, __u32 *uid)
@@ -799,16 +756,16 @@ int nfs_map_name_to_uid(const struct nfs_server *server, const char *name, size_
  
         if (nfs_map_string_to_numeric(name, namelen, uid))
                 return 0;
-       return nfs_idmap_id(idmap, &idmap->idmap_user_hash, name, namelen, uid);
+       return nfs_idmap_lookup_id(name, namelen, "uid", uid, idmap);
  }
  
-int nfs_map_group_to_gid(const struct nfs_server *server, const char *name, size_t namelen, __u32 *uid)
+int nfs_map_group_to_gid(const struct nfs_server *server, const char *name, size_t namelen, __u32 *gid)
  {
         struct idmap *idmap = server->nfs_client->cl_idmap;
  
-       if (nfs_map_string_to_numeric(name, namelen, uid))
+       if (nfs_map_string_to_numeric(name, namelen, gid))
                 return 0;
-       return nfs_idmap_id(idmap, &idmap->idmap_group_hash, name, namelen, uid);
+       return nfs_idmap_lookup_id(name, namelen, "gid", gid, idmap);
  }
  
  int nfs_map_uid_to_name(const struct nfs_server *server, __u32 uid, char *buf, size_t buflen)
@@ -817,21 +774,19 @@ int nfs_map_uid_to_name(const struct nfs_server *server, __u32 uid, char *buf, s
         int ret = -EINVAL;
  
         if (!(server->caps & NFS_CAP_UIDGID_NOMAP))
-               ret = nfs_idmap_name(idmap, &idmap->idmap_user_hash, uid, buf);
+               ret = nfs_idmap_lookup_name(uid, "user", buf, buflen, idmap);
         if (ret < 0)
                 ret = nfs_map_numeric_to_string(uid, buf, buflen);
         return ret;
  }
-int nfs_map_gid_to_group(const struct nfs_server *server, __u32 uid, char *buf, size_t buflen)
+int nfs_map_gid_to_group(const struct nfs_server *server, __u32 gid, char *buf, size_t buflen)
  {
         struct idmap *idmap = server->nfs_client->cl_idmap;
         int ret = -EINVAL;
  
         if (!(server->caps & NFS_CAP_UIDGID_NOMAP))
-               ret = nfs_idmap_name(idmap, &idmap->idmap_group_hash, uid, buf);
+               ret = nfs_idmap_lookup_name(gid, "group", buf, buflen, idmap);
         if (ret < 0)
-               ret = nfs_map_numeric_to_string(uid, buf, buflen);
+               ret = nfs_map_numeric_to_string(gid, buf, buflen);
         return ret;
  }
-
-#endif /* CONFIG_NFS_USE_NEW_IDMAPPER */
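Note on nfs_idmap_prepare_message() in fs/nfs/idmap.c: the switch relies on
deliberate fall-through. The message type defaults to "group", the uid/user
cases override it to "user", and each then falls into the shared handling
for its conversion direction. The user-space sketch below mirrors that
control flow only. It assumes key descriptions of the form "uid:<name>",
"gid:<name>", "user:<id>" and "group:<id>" (the token table is not part of
this section), and every sketch_* name, enum value and size is hypothetical
rather than taken from the kernel.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-ins for the kernel constants; values are illustrative. */
enum { TYPE_USER, TYPE_GROUP };
enum { CONV_NAMETOID, CONV_IDTONAME };
enum sketch_token { TOK_UID, TOK_GID, TOK_USER, TOK_GROUP, TOK_ERR };
#define NAMESZ 128

struct sketch_msg {
	int type;
	int conv;
	unsigned int id;
	char name[NAMESZ];
};

/* Match the description prefix and return a pointer to the argument. */
static enum sketch_token sketch_classify(const char *desc, const char **arg)
{
	static const struct {
		const char *prefix;
		enum sketch_token tok;
	} table[] = {
		{ "uid:", TOK_UID },   { "gid:", TOK_GID },
		{ "user:", TOK_USER }, { "group:", TOK_GROUP },
	};

	for (size_t i = 0; i < sizeof(table) / sizeof(table[0]); i++) {
		size_t n = strlen(table[i].prefix);

		if (strncmp(desc, table[i].prefix, n) == 0) {
			*arg = desc + n;
			return table[i].tok;
		}
	}
	return TOK_ERR;
}

/* Returns 0 on success, -1 if the description cannot be parsed. */
static int sketch_prepare(const char *desc, struct sketch_msg *m)
{
	const char *arg = NULL;
	char *end;

	assert(desc != NULL && m != NULL);
	memset(m, 0, sizeof(*m));
	m->type = TYPE_GROUP;		/* default, overridden below */

	switch (sketch_classify(desc, &arg)) {
	case TOK_UID:
		m->type = TYPE_USER;
		/* fall through */
	case TOK_GID:
		m->conv = CONV_NAMETOID;
		if (strlen(arg) >= NAMESZ)
			return -1;
		strcpy(m->name, arg);
		return 0;
	case TOK_USER:
		m->type = TYPE_USER;
		/* fall through */
	case TOK_GROUP:
		m->conv = CONV_IDTONAME;
		m->id = (unsigned int)strtoul(arg, &end, 10);
		return (*arg != '\0' && *end == '\0') ? 0 : -1;
	default:
		return -1;
	}
}
```

Marking each fall-through with a comment, as in the sketch, makes the intent
visible to readers and to compilers that warn about implicit fall-through.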
diff --git a/fs/nfs/inode.c b/fs/nfs/inode.c
index f649fba8c38489e2cae05eb60e4ead032aca963d..7bb4d13c1cd5ecaa10bbab88b8942ccbd9f9b444 100644
--- a/fs/nfs/inode.c
+++ b/fs/nfs/inode.c
@@ -39,6 +39,7 @@
  #include <linux/slab.h>
  #include <linux/compat.h>
  #include <linux/freezer.h>
+#include <linux/crc32.h>
  
  #include <asm/system.h>
  #include <asm/uaccess.h>
@@ -51,6 +52,7 @@
  #include "fscache.h"
  #include "dns_resolve.h"
  #include "pnfs.h"
+#include "netns.h"
  
  #define NFSDBG_FACILITY                NFSDBG_VFS
  
@@ -388,9 +390,10 @@ nfs_fhget(struct super_block *sb, struct nfs_fh *fh, struct nfs_fattr *fattr)
                 unlock_new_inode(inode);
         } else
                 nfs_refresh_inode(inode, fattr);
-       dprintk("NFS: nfs_fhget(%s/%Ld ct=%d)\n",
+       dprintk("NFS: nfs_fhget(%s/%Ld fh_crc=0x%08x ct=%d)\n",
                 inode->i_sb->s_id,
                 (long long)NFS_FILEID(inode),
+               nfs_display_fhandle_hash(fh),
                 atomic_read(&inode->i_count));
  
  out:
@@ -401,7 +404,7 @@ out_no_inode:
         goto out;
  }
  
-#define NFS_VALID_ATTRS (ATTR_MODE|ATTR_UID|ATTR_GID|ATTR_SIZE|ATTR_ATIME|ATTR_ATIME_SET|ATTR_MTIME|ATTR_MTIME_SET|ATTR_FILE)
+#define NFS_VALID_ATTRS (ATTR_MODE|ATTR_UID|ATTR_GID|ATTR_SIZE|ATTR_ATIME|ATTR_ATIME_SET|ATTR_MTIME|ATTR_MTIME_SET|ATTR_FILE|ATTR_OPEN)
  
  int
  nfs_setattr(struct dentry *dentry, struct iattr *attr)
@@ -423,7 +426,7 @@ nfs_setattr(struct dentry *dentry, struct iattr *attr)
  
         /* Optimization: if the end result is no change, don't RPC */
         attr->ia_valid &= NFS_VALID_ATTRS;
-       if ((attr->ia_valid & ~ATTR_FILE) == 0)
+       if ((attr->ia_valid & ~(ATTR_FILE|ATTR_OPEN)) == 0)
                 return 0;
  
         /* Write all dirty data */
@@ -1044,6 +1047,67 @@ struct nfs_fh *nfs_alloc_fhandle(void)
         return fh;
  }
  
+#ifdef NFS_DEBUG
+/*
+ * _nfs_display_fhandle_hash - calculate the crc32 hash for the filehandle
+ *                             in the same way that wireshark does
+ *
+ * @fh: file handle
+ *
+ * For debugging only.
+ */
+u32 _nfs_display_fhandle_hash(const struct nfs_fh *fh)
+{
+       /* wireshark uses 32-bit AUTODIN crc and does a bitwise
+        * not on the result */
+       return ~crc32(0xFFFFFFFF, &fh->data[0], fh->size);
+}
+
+/*
+ * _nfs_display_fhandle - display an NFS file handle on the console
+ *
+ * @fh: file handle to display
+ * @caption: display caption
+ *
+ * For debugging only.
+ */
+void _nfs_display_fhandle(const struct nfs_fh *fh, const char *caption)
+{
+       unsigned short i;
+
+       if (fh == NULL || fh->size == 0) {
+               printk(KERN_DEFAULT "%s at %p is empty\n", caption, fh);
+               return;
+       }
+
+       printk(KERN_DEFAULT "%s at %p is %u bytes, crc: 0x%08x:\n",
+              caption, fh, fh->size, _nfs_display_fhandle_hash(fh));
+       for (i = 0; i < fh->size; i += 16) {
+               __be32 *pos = (__be32 *)&fh->data[i];
+
+               switch ((fh->size - i - 1) >> 2) {
+               case 0:
+                       printk(KERN_DEFAULT " %08x\n",
+                               be32_to_cpup(pos));
+                       break;
+               case 1:
+                       printk(KERN_DEFAULT " %08x %08x\n",
+                               be32_to_cpup(pos), be32_to_cpup(pos + 1));
+                       break;
+               case 2:
+                       printk(KERN_DEFAULT " %08x %08x %08x\n",
+                               be32_to_cpup(pos), be32_to_cpup(pos + 1),
+                               be32_to_cpup(pos + 2));
+                       break;
+               default:
+                       printk(KERN_DEFAULT " %08x %08x %08x %08x\n",
+                               be32_to_cpup(pos), be32_to_cpup(pos + 1),
+                               be32_to_cpup(pos + 2), be32_to_cpup(pos + 3));
+               }
+       }
+}
+#endif
+
  /**
   * nfs_inode_attrs_need_update - check if the inode attributes need updating
   * @inode - pointer to inode
@@ -1211,8 +1275,9 @@ static int nfs_update_inode(struct inode *inode, struct nfs_fattr *fattr)
         unsigned long now = jiffies;
         unsigned long save_cache_validity;
  
-       dfprintk(VFS, "NFS: %s(%s/%ld ct=%d info=0x%x)\n",
+       dfprintk(VFS, "NFS: %s(%s/%ld fh_crc=0x%08x ct=%d info=0x%x)\n",
                         __func__, inode->i_sb->s_id, inode->i_ino,
+                       nfs_display_fhandle_hash(NFS_FH(inode)),
                         atomic_read(&inode->i_count), fattr->valid);
  
         if ((fattr->valid & NFS_ATTR_FATTR_FILEID) && nfsi->fileid != fattr->fileid)
@@ -1406,7 +1471,7 @@ static int nfs_update_inode(struct inode *inode, struct nfs_fattr *fattr)
         /*
          * Big trouble! The inode has become a different object.
          */
-       printk(KERN_DEBUG "%s: inode %ld mode changed, %07o to %07o\n",
+       printk(KERN_DEBUG "NFS: %s: inode %ld mode changed, %07o to %07o\n",
                         __func__, inode->i_ino, inode->i_mode, fattr->mode);
   out_err:
         /*
@@ -1495,7 +1560,7 @@ static void init_once(void *foo)
         INIT_LIST_HEAD(&nfsi->open_files);
         INIT_LIST_HEAD(&nfsi->access_cache_entry_lru);
         INIT_LIST_HEAD(&nfsi->access_cache_inode_lru);
-       INIT_RADIX_TREE(&nfsi->nfs_page_tree, GFP_ATOMIC);
+       INIT_LIST_HEAD(&nfsi->commit_list);
         nfsi->npages = 0;
         nfsi->ncommit = 0;
         atomic_set(&nfsi->silly_count, 1);
@@ -1552,6 +1617,28 @@ static void nfsiod_stop(void)
         destroy_workqueue(wq);
  }
  
+int nfs_net_id;
+EXPORT_SYMBOL_GPL(nfs_net_id);
+
+static int nfs_net_init(struct net *net)
+{
+       nfs_clients_init(net);
+       return nfs_dns_resolver_cache_init(net);
+}
+
+static void nfs_net_exit(struct net *net)
+{
+       nfs_dns_resolver_cache_destroy(net);
+       nfs_cleanup_cb_ident_idr(net);
+}
+
+static struct pernet_operations nfs_net_ops = {
+       .init = nfs_net_init,
+       .exit = nfs_net_exit,
+       .id   = &nfs_net_id,
+       .size = sizeof(struct nfs_net),
+};
+
  /*
   * Initialize NFS
   */
@@ -1561,9 +1648,13 @@ static int __init init_nfs_fs(void)
  
         err = nfs_idmap_init();
         if (err < 0)
-               goto out9;
+               goto out10;
  
         err = nfs_dns_resolver_init();
+       if (err < 0)
+               goto out9;
+
+       err = register_pernet_subsys(&nfs_net_ops);
         if (err < 0)
                 goto out8;
  
@@ -1600,14 +1691,14 @@ static int __init init_nfs_fs(void)
                 goto out0;
  
  #ifdef CONFIG_PROC_FS
-       rpc_proc_register(&nfs_rpcstat);
+       rpc_proc_register(&init_net, &nfs_rpcstat);
  #endif
         if ((err = register_nfs_fs()) != 0)
                 goto out;
         return 0;
  out:
  #ifdef CONFIG_PROC_FS
-       rpc_proc_unregister("nfs");
+       rpc_proc_unregister(&init_net, "nfs");
  #endif
         nfs_destroy_directcache();
  out0:
@@ -1625,10 +1716,12 @@ out5:
  out6:
         nfs_fscache_unregister();
  out7:
-       nfs_dns_resolver_destroy();
+       unregister_pernet_subsys(&nfs_net_ops);
  out8:
-       nfs_idmap_quit();
+       nfs_dns_resolver_destroy();
  out9:
+       nfs_idmap_quit();
+out10:
         return err;
  }
  
@@ -1640,12 +1733,12 @@ static void __exit exit_nfs_fs(void)
         nfs_destroy_inodecache();
         nfs_destroy_nfspagecache();
         nfs_fscache_unregister();
+       unregister_pernet_subsys(&nfs_net_ops);
         nfs_dns_resolver_destroy();
         nfs_idmap_quit();
  #ifdef CONFIG_PROC_FS
-       rpc_proc_unregister("nfs");
+       rpc_proc_unregister(&init_net, "nfs");
  #endif
-       nfs_cleanup_cb_ident_idr();
         unregister_nfs_fs();
         nfs_fs_proc_exit();
         nfsiod_stop();
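Note on _nfs_display_fhandle_hash() in fs/nfs/inode.c: seeding the CRC with
0xFFFFFFFF and inverting the result should yield the conventional CRC-32
checksum, which is what lets the value be compared with the file handle
hash Wireshark shows. The user-space sketch below reproduces that
computation under the assumption that the kernel's crc32() is the
bit-reflected CRC with polynomial 0xEDB88320 and no built-in inversion. The
names crc32_le_raw and fhandle_hash are hypothetical, not kernel symbols.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Bitwise reflected CRC-32 update, with no initial or final inversion. */
static uint32_t crc32_le_raw(uint32_t crc, const unsigned char *p, size_t len)
{
	while (len--) {
		crc ^= *p++;
		for (int bit = 0; bit < 8; bit++)
			crc = (crc >> 1) ^ ((crc & 1u) ? 0xEDB88320u : 0u);
	}
	return crc;
}

/* Seed with all ones and invert the result, as the kernel helper does. */
static uint32_t fhandle_hash(const unsigned char *data, size_t size)
{
	assert(data != NULL || size == 0);
	return ~crc32_le_raw(0xFFFFFFFFu, data, size);
}
```

A quick sanity check is the standard CRC-32 test vector: hashing the nine
ASCII bytes "123456789" this way gives 0xCBF43926.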
diff --git a/fs/nfs/internal.h b/fs/nfs/internal.h
index 8102db9b926c2eb56d9035126fa27b33058d711f..2476dc69365f223d78a0b514991bcb88fa144ec9 100644
--- a/fs/nfs/internal.h
+++ b/fs/nfs/internal.h
@@ -123,6 +123,7 @@ struct nfs_parsed_mount_data {
         } nfs_server;
  
         struct security_mnt_opts lsm_opts;
+       struct net              *net;
  };
  
  /* mount_clnt.c */
@@ -137,20 +138,22 @@ struct nfs_mount_request {
         int                     noresvport;
         unsigned int            *auth_flav_len;
         rpc_authflavor_t        *auth_flavs;
+       struct net              *net;
  };
  
  extern int nfs_mount(struct nfs_mount_request *info);
  extern void nfs_umount(const struct nfs_mount_request *info);
  
  /* client.c */
-extern struct rpc_program nfs_program;
+extern const struct rpc_program nfs_program;
+extern void nfs_clients_init(struct net *net);
  
-extern void nfs_cleanup_cb_ident_idr(void);
+extern void nfs_cleanup_cb_ident_idr(struct net *);
  extern void nfs_put_client(struct nfs_client *);
-extern struct nfs_client *nfs4_find_client_no_ident(const struct sockaddr *);
-extern struct nfs_client *nfs4_find_client_ident(int);
+extern struct nfs_client *nfs4_find_client_ident(struct net *, int);
  extern struct nfs_client *
-nfs4_find_client_sessionid(const struct sockaddr *, struct nfs4_sessionid *);
+nfs4_find_client_sessionid(struct net *, const struct sockaddr *,
+                               struct nfs4_sessionid *);
  extern struct nfs_server *nfs_create_server(
                                         const struct nfs_parsed_mount_data *,
                                         struct nfs_fh *);
@@ -329,6 +332,8 @@ void nfs_retry_commit(struct list_head *page_list,
  void nfs_commit_clear_lock(struct nfs_inode *nfsi);
  void nfs_commitdata_release(void *data);
  void nfs_commit_release_pages(struct nfs_write_data *data);
+void nfs_request_add_commit_list(struct nfs_page *req, struct list_head *head);
+void nfs_request_remove_commit_list(struct nfs_page *req);
  
  #ifdef CONFIG_MIGRATION
  extern int nfs_migrate_page(struct address_space *,
diff --git a/fs/nfs/mount_clnt.c b/fs/nfs/mount_clnt.c
index d4c2d6b7507e791044df2837c1d0803699a2a518..8e65c7f1f87c526707959c0e691e36532406d1fa 100644
--- a/fs/nfs/mount_clnt.c
+++ b/fs/nfs/mount_clnt.c
@@ -16,7 +16,7 @@
  #include <linux/nfs_fs.h>
  #include "internal.h"
  
-#ifdef RPC_DEBUG
+#ifdef NFS_DEBUG
  # define NFSDBG_FACILITY       NFSDBG_MOUNT
  #endif
  
@@ -67,7 +67,7 @@ enum {
         MOUNTPROC3_EXPORT       = 5,
  };
  
-static struct rpc_program      mnt_program;
+static const struct rpc_program mnt_program;
  
  /*
   * Defined by OpenGroup XNFS Version 3W, chapter 8
@@ -153,7 +153,7 @@ int nfs_mount(struct nfs_mount_request *info)
                 .rpc_resp       = &result,
         };
         struct rpc_create_args args = {
-               .net            = &init_net,
+               .net            = info->net,
                 .protocol       = info->protocol,
                 .address        = info->sap,
                 .addrsize       = info->salen,
@@ -225,7 +225,7 @@ void nfs_umount(const struct nfs_mount_request *info)
                 .to_retries = 2,
         };
         struct rpc_create_args args = {
-               .net            = &init_net,
+               .net            = info->net,
                 .protocol       = IPPROTO_UDP,
                 .address        = info->sap,
                 .addrsize       = info->salen,
@@ -488,19 +488,19 @@ static struct rpc_procinfo mnt3_procedures[] = {
  };
  
  
-static struct rpc_version mnt_version1 = {
+static const struct rpc_version mnt_version1 = {
         .number         = 1,
         .nrprocs        = ARRAY_SIZE(mnt_procedures),
         .procs          = mnt_procedures,
  };
  
-static struct rpc_version mnt_version3 = {
+static const struct rpc_version mnt_version3 = {
         .number         = 3,
         .nrprocs        = ARRAY_SIZE(mnt3_procedures),
         .procs          = mnt3_procedures,
  };
  
-static struct rpc_version *mnt_version[] = {
+static const struct rpc_version *mnt_version[] = {
         NULL,
         &mnt_version1,
         NULL,
@@ -509,7 +509,7 @@ static struct rpc_version *mnt_version[] = {
  
  static struct rpc_stat mnt_stats;
  
-static struct rpc_program mnt_program = {
+static const struct rpc_program mnt_program = {
         .name           = "mount",
         .number         = NFS_MNT_PROGRAM,
         .nrvers         = ARRAY_SIZE(mnt_version),
diff --git a/fs/nfs/namespace.c b/fs/nfs/namespace.c
index 8102391bb3744077ae778af2d5bdb5d135ed3fbf..1807866bb3ab845098de2a95c695bee5460aaf37 100644
--- a/fs/nfs/namespace.c
+++ b/fs/nfs/namespace.c
@@ -276,7 +276,10 @@ out:
         nfs_free_fattr(fattr);
         nfs_free_fhandle(fh);
  out_nofree:
-       dprintk("<-- nfs_follow_mountpoint() = %p\n", mnt);
+       if (IS_ERR(mnt))
+               dprintk("<-- %s(): error %ld\n", __func__, PTR_ERR(mnt));
+       else
+               dprintk("<-- %s() = %p\n", __func__, mnt);
         return mnt;
  }
  
diff --git a/fs/nfs/netns.h b/fs/nfs/netns.h
new file mode 100644
index 0000000..aa14ec3
--- /dev/null
+++ b/fs/nfs/netns.h
@@ -0,0 +1,27 @@
+#ifndef __NFS_NETNS_H__
+#define __NFS_NETNS_H__
+
+#include <net/net_namespace.h>
+#include <net/netns/generic.h>
+
+struct bl_dev_msg {
+       int32_t status;
+       uint32_t major, minor;
+};
+
+struct nfs_net {
+       struct cache_detail *nfs_dns_resolve;
+       struct rpc_pipe *bl_device_pipe;
+       struct bl_dev_msg bl_mount_reply;
+       wait_queue_head_t bl_wq;
+       struct list_head nfs_client_list;
+       struct list_head nfs_volume_list;
+#ifdef CONFIG_NFS_V4
+       struct idr cb_ident_idr; /* Protected by nfs_client_lock */
+#endif
+       spinlock_t nfs_client_lock;
+};
+
+extern int nfs_net_id;
+
+#endif
diff --git a/fs/nfs/nfs2xdr.c b/fs/nfs/nfs2xdr.c
index 792cb13a430425c522e662b1e4915ffe7a893e9d..1f56000fabbdc1b1283961e340661e20e986e3bf 100644
--- a/fs/nfs/nfs2xdr.c
+++ b/fs/nfs/nfs2xdr.c
@@ -1150,7 +1150,7 @@ struct rpc_procinfo       nfs_procedures[] = {
         PROC(STATFS,    fhandle,        statfsres,      0),
  };
  
-struct rpc_version             nfs_version2 = {
+const struct rpc_version nfs_version2 = {
         .number                 = 2,
         .nrprocs                = ARRAY_SIZE(nfs_procedures),
         .procs                  = nfs_procedures
diff --git a/fs/nfs/nfs3acl.c b/fs/nfs/nfs3acl.c
index 7ef23979896dd2cffdbc43aa8b0a5a9e3aaecb09..e4498dc351a834fcc35722645abfa50eaefa80fd 100644
--- a/fs/nfs/nfs3acl.c
+++ b/fs/nfs/nfs3acl.c
@@ -192,7 +192,7 @@ struct posix_acl *nfs3_proc_getacl(struct inode *inode, int type)
                 .pages = pages,
         };
         struct nfs3_getaclres res = {
-               0
+               NULL,
         };
         struct rpc_message msg = {
                 .rpc_argp       = &args,
diff --git a/fs/nfs/nfs3proc.c b/fs/nfs/nfs3proc.c
index 91943953a3703edb21e30e96b2b01655d3fa3234..5242eae6711a0b5323b1dcec13d57ecf01c47986 100644
--- a/fs/nfs/nfs3proc.c
+++ b/fs/nfs/nfs3proc.c
@@ -428,6 +428,11 @@ nfs3_proc_unlink_setup(struct rpc_message *msg, struct inode *dir)
         msg->rpc_proc = &nfs3_procedures[NFS3PROC_REMOVE];
  }
  
+static void nfs3_proc_unlink_rpc_prepare(struct rpc_task *task, struct nfs_unlinkdata *data)
+{
+       rpc_call_start(task);
+}
+
  static int
  nfs3_proc_unlink_done(struct rpc_task *task, struct inode *dir)
  {
@@ -445,6 +450,11 @@ nfs3_proc_rename_setup(struct rpc_message *msg, struct inode *dir)
         msg->rpc_proc = &nfs3_procedures[NFS3PROC_RENAME];
  }
  
+static void nfs3_proc_rename_rpc_prepare(struct rpc_task *task, struct nfs_renamedata *data)
+{
+       rpc_call_start(task);
+}
+
  static int
  nfs3_proc_rename_done(struct rpc_task *task, struct inode *old_dir,
                       struct inode *new_dir)
@@ -814,6 +824,11 @@ static void nfs3_proc_read_setup(struct nfs_read_data *data, struct rpc_message
         msg->rpc_proc = &nfs3_procedures[NFS3PROC_READ];
  }
  
+static void nfs3_proc_read_rpc_prepare(struct rpc_task *task, struct nfs_read_data *data)
+{
+       rpc_call_start(task);
+}
+
  static int nfs3_write_done(struct rpc_task *task, struct nfs_write_data *data)
  {
         if (nfs3_async_handle_jukebox(task, data->inode))
@@ -828,6 +843,11 @@ static void nfs3_proc_write_setup(struct nfs_write_data *data, struct rpc_messag
         msg->rpc_proc = &nfs3_procedures[NFS3PROC_WRITE];
  }
  
+static void nfs3_proc_write_rpc_prepare(struct rpc_task *task, struct nfs_write_data *data)
+{
+       rpc_call_start(task);
+}
+
  static int nfs3_commit_done(struct rpc_task *task, struct nfs_write_data *data)
  {
         if (nfs3_async_handle_jukebox(task, data->inode))
@@ -864,9 +884,11 @@ const struct nfs_rpc_ops nfs_v3_clientops = {
         .create         = nfs3_proc_create,
         .remove         = nfs3_proc_remove,
         .unlink_setup   = nfs3_proc_unlink_setup,
+       .unlink_rpc_prepare = nfs3_proc_unlink_rpc_prepare,
         .unlink_done    = nfs3_proc_unlink_done,
         .rename         = nfs3_proc_rename,
         .rename_setup   = nfs3_proc_rename_setup,
+       .rename_rpc_prepare = nfs3_proc_rename_rpc_prepare,
         .rename_done    = nfs3_proc_rename_done,
         .link           = nfs3_proc_link,
         .symlink        = nfs3_proc_symlink,
@@ -879,8 +901,10 @@ const struct nfs_rpc_ops nfs_v3_clientops = {
         .pathconf       = nfs3_proc_pathconf,
         .decode_dirent  = nfs3_decode_dirent,
         .read_setup     = nfs3_proc_read_setup,
+       .read_rpc_prepare = nfs3_proc_read_rpc_prepare,
         .read_done      = nfs3_read_done,
         .write_setup    = nfs3_proc_write_setup,
+       .write_rpc_prepare = nfs3_proc_write_rpc_prepare,
         .write_done     = nfs3_write_done,
         .commit_setup   = nfs3_proc_commit_setup,
         .commit_done    = nfs3_commit_done,
diff --git a/fs/nfs/nfs3xdr.c b/fs/nfs/nfs3xdr.c
index 183c6b123d0f53bd43209cd6a6634aaa8b9f484f..a77cc9a3ce5561f1d8b23e78bb16ac49fcbf14b4 100644
--- a/fs/nfs/nfs3xdr.c
+++ b/fs/nfs/nfs3xdr.c
@@ -2461,7 +2461,7 @@ struct rpc_procinfo       nfs3_procedures[] = {
         PROC(COMMIT,            commit,         commit,         5),
  };
  
-struct rpc_version             nfs_version3 = {
+const struct rpc_version nfs_version3 = {
         .number                 = 3,
         .nrprocs                = ARRAY_SIZE(nfs3_procedures),
         .procs                  = nfs3_procedures
@@ -2489,7 +2489,7 @@ static struct rpc_procinfo        nfs3_acl_procedures[] = {
         },
  };
  
-struct rpc_version             nfsacl_version3 = {
+const struct rpc_version nfsacl_version3 = {
         .number                 = 3,
         .nrprocs                = sizeof(nfs3_acl_procedures)/
                                   sizeof(nfs3_acl_procedures[0]),
diff --git a/fs/nfs/nfs4_fs.h b/fs/nfs/nfs4_fs.h
index 4d7d0aedc101831ecb3b10cf345f0e18e7ca56ad..97ecc863dd76b46900e23758d4cbdda2f28f63d0 100644
--- a/fs/nfs/nfs4_fs.h
+++ b/fs/nfs/nfs4_fs.h
@@ -20,7 +20,6 @@ enum nfs4_client_state {
         NFS4CLNT_RECLAIM_REBOOT,
         NFS4CLNT_RECLAIM_NOGRACE,
         NFS4CLNT_DELEGRETURN,
-       NFS4CLNT_LAYOUTRECALL,
         NFS4CLNT_SESSION_RESET,
         NFS4CLNT_RECALL_SLOT,
         NFS4CLNT_LEASE_CONFIRM,
@@ -44,7 +43,7 @@ struct nfs4_minor_version_ops {
                         struct nfs4_sequence_args *args,
                         struct nfs4_sequence_res *res,
                         int cache_reply);
-       int     (*validate_stateid)(struct nfs_delegation *,
+       bool    (*match_stateid)(const nfs4_stateid *,
                         const nfs4_stateid *);
         int     (*find_root_sec)(struct nfs_server *, struct nfs_fh *,
                         struct nfs_fsinfo *);
@@ -53,26 +52,25 @@ struct nfs4_minor_version_ops {
         const struct nfs4_state_maintenance_ops *state_renewal_ops;
  };
  
-/*
- * struct rpc_sequence ensures that RPC calls are sent in the exact
- * order that they appear on the list.
- */
-struct rpc_sequence {
-       struct rpc_wait_queue   wait;   /* RPC call delay queue */
-       spinlock_t lock;                /* Protects the list */
-       struct list_head list;          /* Defines sequence of RPC calls */
+struct nfs_unique_id {
+       struct rb_node rb_node;
+       __u64 id;
  };
  
  #define NFS_SEQID_CONFIRMED 1
  struct nfs_seqid_counter {
-       struct rpc_sequence *sequence;
+       int owner_id;
         int flags;
         u32 counter;
+       spinlock_t lock;                /* Protects the list */
+       struct list_head list;          /* Defines sequence of RPC calls */
+       struct rpc_wait_queue   wait;   /* RPC call delay queue */
  };
  
  struct nfs_seqid {
         struct nfs_seqid_counter *sequence;
         struct list_head list;
+       struct rpc_task *task;
  };
  
  static inline void nfs_confirm_seqid(struct nfs_seqid_counter *seqid, int status)
@@ -81,18 +79,12 @@ static inline void nfs_confirm_seqid(struct nfs_seqid_counter *seqid, int status
                 seqid->flags |= NFS_SEQID_CONFIRMED;
  }
  
-struct nfs_unique_id {
-       struct rb_node rb_node;
-       __u64 id;
-};
-
  /*
   * NFS4 state_owners and lock_owners are simply labels for ordered
   * sequences of RPC calls. Their sole purpose is to provide once-only
   * semantics by allowing the server to identify replayed requests.
   */
  struct nfs4_state_owner {
-       struct nfs_unique_id so_owner_id;
         struct nfs_server    *so_server;
         struct list_head     so_lru;
         unsigned long        so_expires;
@@ -105,7 +97,6 @@ struct nfs4_state_owner {
         unsigned long        so_flags;
         struct list_head     so_states;
         struct nfs_seqid_counter so_seqid;
-       struct rpc_sequence  so_sequence;
  };
  
  enum {
@@ -146,8 +137,6 @@ struct nfs4_lock_state {
  #define NFS_LOCK_INITIALIZED 1
         int                     ls_flags;
         struct nfs_seqid_counter        ls_seqid;
-       struct rpc_sequence     ls_sequence;
-       struct nfs_unique_id    ls_id;
         nfs4_stateid            ls_stateid;
         atomic_t                ls_count;
         struct nfs4_lock_owner  ls_owner;
@@ -193,6 +182,7 @@ struct nfs4_exception {
         long timeout;
         int retry;
         struct nfs4_state *state;
+       struct inode *inode;
  };
  
  struct nfs4_state_recovery_ops {
@@ -224,7 +214,7 @@ extern int nfs4_do_close(struct nfs4_state *state, gfp_t gfp_mask, int wait, boo
  extern int nfs4_server_capabilities(struct nfs_server *server, struct nfs_fh *fhandle);
  extern int nfs4_proc_fs_locations(struct inode *dir, const struct qstr *name,
                 struct nfs4_fs_locations *fs_locations, struct page *page);
-extern void nfs4_release_lockowner(const struct nfs4_lock_state *);
+extern int nfs4_release_lockowner(struct nfs4_lock_state *);
  extern const struct xattr_handler *nfs4_xattr_handlers[];
  
  #if defined(CONFIG_NFS_V4_1)
@@ -233,12 +223,13 @@ static inline struct nfs4_session *nfs4_get_session(const struct nfs_server *ser
         return server->nfs_client->cl_session;
  }
  
+extern bool nfs4_set_task_privileged(struct rpc_task *task, void *dummy);
  extern int nfs4_setup_sequence(const struct nfs_server *server,
                 struct nfs4_sequence_args *args, struct nfs4_sequence_res *res,
-               int cache_reply, struct rpc_task *task);
+               struct rpc_task *task);
  extern int nfs41_setup_sequence(struct nfs4_session *session,
                 struct nfs4_sequence_args *args, struct nfs4_sequence_res *res,
-               int cache_reply, struct rpc_task *task);
+               struct rpc_task *task);
  extern void nfs4_destroy_session(struct nfs4_session *session);
  extern struct nfs4_session *nfs4_alloc_session(struct nfs_client *clp);
  extern int nfs4_proc_create_session(struct nfs_client *);
@@ -269,7 +260,7 @@ static inline struct nfs4_session *nfs4_get_session(const struct nfs_server *ser
  
  static inline int nfs4_setup_sequence(const struct nfs_server *server,
                 struct nfs4_sequence_args *args, struct nfs4_sequence_res *res,
-               int cache_reply, struct rpc_task *task)
+               struct rpc_task *task)
  {
         return 0;
  }
@@ -319,7 +310,7 @@ static inline void nfs4_schedule_session_recovery(struct nfs4_session *session)
  }
  #endif /* CONFIG_NFS_V4_1 */
  
-extern struct nfs4_state_owner * nfs4_get_state_owner(struct nfs_server *, struct rpc_cred *);
+extern struct nfs4_state_owner *nfs4_get_state_owner(struct nfs_server *, struct rpc_cred *, gfp_t);
  extern void nfs4_put_state_owner(struct nfs4_state_owner *);
  extern void nfs4_purge_state_owners(struct nfs_server *);
  extern struct nfs4_state * nfs4_get_open_state(struct inode *, struct nfs4_state_owner *);
@@ -327,6 +318,8 @@ extern void nfs4_put_open_state(struct nfs4_state *);
  extern void nfs4_close_state(struct nfs4_state *, fmode_t);
  extern void nfs4_close_sync(struct nfs4_state *, fmode_t);
  extern void nfs4_state_set_mode_locked(struct nfs4_state *, fmode_t);
+extern void nfs_inode_find_state_and_recover(struct inode *inode,
+               const nfs4_stateid *stateid);
  extern void nfs4_schedule_lease_recovery(struct nfs_client *);
  extern void nfs4_schedule_state_manager(struct nfs_client *);
  extern void nfs4_schedule_path_down_recovery(struct nfs_client *clp);
@@ -337,7 +330,8 @@ extern void nfs41_handle_server_scope(struct nfs_client *,
                                       struct server_scope **);
  extern void nfs4_put_lock_state(struct nfs4_lock_state *lsp);
  extern int nfs4_set_lock_state(struct nfs4_state *state, struct file_lock *fl);
-extern void nfs4_copy_stateid(nfs4_stateid *, struct nfs4_state *, fl_owner_t, pid_t);
+extern void nfs4_select_rw_stateid(nfs4_stateid *, struct nfs4_state *,
+               fmode_t, fl_owner_t, pid_t);
  
  extern struct nfs_seqid *nfs_alloc_seqid(struct nfs_seqid_counter *counter, gfp_t gfp_mask);
  extern int nfs_wait_on_sequence(struct nfs_seqid *seqid, struct rpc_task *task);
@@ -346,6 +340,8 @@ extern void nfs_increment_lock_seqid(int status, struct nfs_seqid *seqid);
  extern void nfs_release_seqid(struct nfs_seqid *seqid);
  extern void nfs_free_seqid(struct nfs_seqid *seqid);
  
+extern void nfs4_free_lock_state(struct nfs_server *server, struct nfs4_lock_state *lsp);
+
  extern const nfs4_stateid zero_stateid;
  
  /* nfs4xdr.c */
@@ -357,6 +353,16 @@ struct nfs4_mount_data;
  extern struct svc_version nfs4_callback_version1;
  extern struct svc_version nfs4_callback_version4;
  
+static inline void nfs4_stateid_copy(nfs4_stateid *dst, const nfs4_stateid *src)
+{
+       memcpy(dst, src, sizeof(*dst));
+}
+
+static inline bool nfs4_stateid_match(const nfs4_stateid *dst, const nfs4_stateid *src)
+{
+       return memcmp(dst, src, sizeof(*dst)) == 0;
+}
+
  #else
  
  #define nfs4_close_state(a, b) do { } while (0)
diff --git a/fs/nfs/nfs4filelayout.c b/fs/nfs/nfs4filelayout.c
index 71ec08617e23820b8e3d11ee8e47a4217ee52e55..634c0bcb4fd6878776f9d560e8ad023199b975d4 100644
--- a/fs/nfs/nfs4filelayout.c
+++ b/fs/nfs/nfs4filelayout.c
@@ -33,7 +33,10 @@
  #include <linux/nfs_page.h>
  #include <linux/module.h>
  
+#include <linux/sunrpc/metrics.h>
+
  #include "internal.h"
+#include "delegation.h"
  #include "nfs4filelayout.h"
  
  #define NFSDBG_FACILITY         NFSDBG_PNFS_LD
@@ -84,12 +87,27 @@ static int filelayout_async_handle_error(struct rpc_task *task,
                                          struct nfs_client *clp,
                                          int *reset)
  {
+       struct nfs_server *mds_server = NFS_SERVER(state->inode);
+       struct nfs_client *mds_client = mds_server->nfs_client;
+
         if (task->tk_status >= 0)
                 return 0;
-
         *reset = 0;
  
         switch (task->tk_status) {
+       /* MDS state errors */
+       case -NFS4ERR_DELEG_REVOKED:
+       case -NFS4ERR_ADMIN_REVOKED:
+       case -NFS4ERR_BAD_STATEID:
+               nfs_remove_bad_delegation(state->inode);
+       case -NFS4ERR_OPENMODE:
+               nfs4_schedule_stateid_recovery(mds_server, state);
+               goto wait_on_recovery;
+       case -NFS4ERR_EXPIRED:
+               nfs4_schedule_stateid_recovery(mds_server, state);
+               nfs4_schedule_lease_recovery(mds_client);
+               goto wait_on_recovery;
+       /* DS session errors */
         case -NFS4ERR_BADSESSION:
         case -NFS4ERR_BADSLOT:
         case -NFS4ERR_BAD_HIGH_SLOT:
@@ -115,8 +133,14 @@ static int filelayout_async_handle_error(struct rpc_task *task,
                 *reset = 1;
                 break;
         }
+out:
         task->tk_status = 0;
         return -EAGAIN;
+wait_on_recovery:
+       rpc_sleep_on(&mds_client->cl_rpcwaitq, task, NULL);
+       if (test_bit(NFS4CLNT_MANAGER_RUNNING, &mds_client->cl_state) == 0)
+               rpc_wake_up_queued_task(&mds_client->cl_rpcwaitq, task);
+       goto out;
  }
  
  /* NFS_PROTO call done callback routines */
@@ -173,7 +197,7 @@ static void filelayout_read_prepare(struct rpc_task *task, void *data)
  
         if (nfs41_setup_sequence(rdata->ds_clp->cl_session,
                                 &rdata->args.seq_args, &rdata->res.seq_res,
-                               0, task))
+                               task))
                 return;
  
         rpc_call_start(task);
@@ -189,10 +213,18 @@ static void filelayout_read_call_done(struct rpc_task *task, void *data)
         rdata->mds_ops->rpc_call_done(task, data);
  }
  
+static void filelayout_read_count_stats(struct rpc_task *task, void *data)
+{
+       struct nfs_read_data *rdata = (struct nfs_read_data *)data;
+
+       rpc_count_iostats(task, NFS_SERVER(rdata->inode)->client->cl_metrics);
+}
+
  static void filelayout_read_release(void *data)
  {
         struct nfs_read_data *rdata = (struct nfs_read_data *)data;
  
+       put_lseg(rdata->lseg);
         rdata->mds_ops->rpc_release(data);
  }
  
@@ -254,7 +286,7 @@ static void filelayout_write_prepare(struct rpc_task *task, void *data)
  
         if (nfs41_setup_sequence(wdata->ds_clp->cl_session,
                                 &wdata->args.seq_args, &wdata->res.seq_res,
-                               0, task))
+                               task))
                 return;
  
         rpc_call_start(task);
@@ -268,10 +300,18 @@ static void filelayout_write_call_done(struct rpc_task *task, void *data)
         wdata->mds_ops->rpc_call_done(task, data);
  }
  
+static void filelayout_write_count_stats(struct rpc_task *task, void *data)
+{
+       struct nfs_write_data *wdata = (struct nfs_write_data *)data;
+
+       rpc_count_iostats(task, NFS_SERVER(wdata->inode)->client->cl_metrics);
+}
+
  static void filelayout_write_release(void *data)
  {
         struct nfs_write_data *wdata = (struct nfs_write_data *)data;
  
+       put_lseg(wdata->lseg);
         wdata->mds_ops->rpc_release(data);
  }
  
@@ -282,24 +322,28 @@ static void filelayout_commit_release(void *data)
         nfs_commit_release_pages(wdata);
         if (atomic_dec_and_test(&NFS_I(wdata->inode)->commits_outstanding))
                 nfs_commit_clear_lock(NFS_I(wdata->inode));
+       put_lseg(wdata->lseg);
         nfs_commitdata_release(wdata);
  }
  
-struct rpc_call_ops filelayout_read_call_ops = {
+static const struct rpc_call_ops filelayout_read_call_ops = {
         .rpc_call_prepare = filelayout_read_prepare,
         .rpc_call_done = filelayout_read_call_done,
+       .rpc_count_stats = filelayout_read_count_stats,
         .rpc_release = filelayout_read_release,
  };
  
-struct rpc_call_ops filelayout_write_call_ops = {
+static const struct rpc_call_ops filelayout_write_call_ops = {
         .rpc_call_prepare = filelayout_write_prepare,
         .rpc_call_done = filelayout_write_call_done,
+       .rpc_count_stats = filelayout_write_count_stats,
         .rpc_release = filelayout_write_release,
  };
  
-struct rpc_call_ops filelayout_commit_call_ops = {
+static const struct rpc_call_ops filelayout_commit_call_ops = {
         .rpc_call_prepare = filelayout_write_prepare,
         .rpc_call_done = filelayout_write_call_done,
+       .rpc_count_stats = filelayout_write_count_stats,
         .rpc_release = filelayout_commit_release,
  };
  
@@ -367,7 +411,8 @@ filelayout_write_pagelist(struct nfs_write_data *data, int sync)
         idx = nfs4_fl_calc_ds_index(lseg, j);
         ds = nfs4_fl_prepare_ds(lseg, idx);
         if (!ds) {
-               printk(KERN_ERR "%s: prepare_ds failed, use MDS\n", __func__);
+               printk(KERN_ERR "NFS: %s: prepare_ds failed, use MDS\n",
+                       __func__);
                 set_bit(lo_fail_bit(IOMODE_RW), &lseg->pls_layout->plh_flags);
                 set_bit(lo_fail_bit(IOMODE_READ), &lseg->pls_layout->plh_flags);
                 return PNFS_NOT_ATTEMPTED;
@@ -575,7 +620,7 @@ filelayout_decode_layout(struct pnfs_layout_hdr *flo,
                         goto out_err_free;
                 fl->fh_array[i]->size = be32_to_cpup(p++);
                 if (sizeof(struct nfs_fh) < fl->fh_array[i]->size) {
-                       printk(KERN_ERR "Too big fh %d received %d\n",
+                       printk(KERN_ERR "NFS: Too big fh %d received %d\n",
                                i, fl->fh_array[i]->size);
                         goto out_err_free;
                 }
@@ -640,14 +685,16 @@ filelayout_alloc_lseg(struct pnfs_layout_hdr *layoutid,
                 int size = (fl->stripe_type == STRIPE_SPARSE) ?
                         fl->dsaddr->ds_num : fl->dsaddr->stripe_count;
  
-               fl->commit_buckets = kcalloc(size, sizeof(struct list_head), gfp_flags);
+               fl->commit_buckets = kcalloc(size, sizeof(struct nfs4_fl_commit_bucket), gfp_flags);
                 if (!fl->commit_buckets) {
                         filelayout_free_lseg(&fl->generic_hdr);
                         return NULL;
                 }
                 fl->number_of_buckets = size;
-               for (i = 0; i < size; i++)
-                       INIT_LIST_HEAD(&fl->commit_buckets[i]);
+               for (i = 0; i < size; i++) {
+                       INIT_LIST_HEAD(&fl->commit_buckets[i].written);
+                       INIT_LIST_HEAD(&fl->commit_buckets[i].committing);
+               }
         }
         return &fl->generic_hdr;
  }
@@ -679,7 +726,7 @@ filelayout_pg_test(struct nfs_pageio_descriptor *pgio, struct nfs_page *prev,
         return (p_stripe == r_stripe);
  }
  
-void
+static void
  filelayout_pg_init_read(struct nfs_pageio_descriptor *pgio,
                         struct nfs_page *req)
  {
@@ -696,7 +743,7 @@ filelayout_pg_init_read(struct nfs_pageio_descriptor *pgio,
                 nfs_pageio_reset_read_mds(pgio);
  }
  
-void
+static void
  filelayout_pg_init_write(struct nfs_pageio_descriptor *pgio,
                          struct nfs_page *req)
  {
@@ -725,11 +772,6 @@ static const struct nfs_pageio_ops filelayout_pg_write_ops = {
         .pg_doio = pnfs_generic_pg_writepages,
  };
  
-static bool filelayout_mark_pnfs_commit(struct pnfs_layout_segment *lseg)
-{
-       return !FILELAYOUT_LSEG(lseg)->commit_through_mds;
-}
-
  static u32 select_bucket_index(struct nfs4_filelayout_segment *fl, u32 j)
  {
         if (fl->stripe_type == STRIPE_SPARSE)
@@ -738,13 +780,49 @@ static u32 select_bucket_index(struct nfs4_filelayout_segment *fl, u32 j)
                 return j;
  }
  
-struct list_head *filelayout_choose_commit_list(struct nfs_page *req)
+/* The generic layer is about to remove the req from the commit list.
+ * If this will make the bucket empty, it will need to put the lseg reference.
+ */
+static void
+filelayout_clear_request_commit(struct nfs_page *req)
+{
+       struct pnfs_layout_segment *freeme = NULL;
+       struct inode *inode = req->wb_context->dentry->d_inode;
+
+       spin_lock(&inode->i_lock);
+       if (!test_and_clear_bit(PG_COMMIT_TO_DS, &req->wb_flags))
+               goto out;
+       if (list_is_singular(&req->wb_list)) {
+               struct inode *inode = req->wb_context->dentry->d_inode;
+               struct pnfs_layout_segment *lseg;
+
+               /* From here we can find the bucket, but for the moment,
+                * since there is only one relevant lseg...
+                */
+               list_for_each_entry(lseg, &NFS_I(inode)->layout->plh_segs, pls_list) {
+                       if (lseg->pls_range.iomode == IOMODE_RW) {
+                               freeme = lseg;
+                               break;
+                       }
+               }
+       }
+out:
+       nfs_request_remove_commit_list(req);
+       spin_unlock(&inode->i_lock);
+       put_lseg(freeme);
+}
+
+static struct list_head *
+filelayout_choose_commit_list(struct nfs_page *req,
+                             struct pnfs_layout_segment *lseg)
  {
-       struct pnfs_layout_segment *lseg = req->wb_commit_lseg;
         struct nfs4_filelayout_segment *fl = FILELAYOUT_LSEG(lseg);
         u32 i, j;
         struct list_head *list;
  
+       if (fl->commit_through_mds)
+               return &NFS_I(req->wb_context->dentry->d_inode)->commit_list;
+
         /* Note that we are calling nfs4_fl_calc_j_index on each page
          * that ends up being committed to a data server.  An attractive
          * alternative is to add a field to nfs_write_data and nfs_page
@@ -754,14 +832,30 @@ struct list_head *filelayout_choose_commit_list(struct nfs_page *req)
         j = nfs4_fl_calc_j_index(lseg,
                                  (loff_t)req->wb_index << PAGE_CACHE_SHIFT);
         i = select_bucket_index(fl, j);
-       list = &fl->commit_buckets[i];
+       list = &fl->commit_buckets[i].written;
         if (list_empty(list)) {
-               /* Non-empty buckets hold a reference on the lseg */
+               /* Non-empty buckets hold a reference on the lseg.  That ref
+                * is normally transferred to the COMMIT call and released
+                * there.  It could also be released if the last req is pulled
+                * off due to a rewrite, in which case it will be done in
+                * filelayout_remove_commit_req
+                */
                 get_lseg(lseg);
         }
+       set_bit(PG_COMMIT_TO_DS, &req->wb_flags);
         return list;
  }
  
+static void
+filelayout_mark_request_commit(struct nfs_page *req,
+               struct pnfs_layout_segment *lseg)
+{
+       struct list_head *list;
+
+       list = filelayout_choose_commit_list(req, lseg);
+       nfs_request_add_commit_list(req, list);
+}
+
  static u32 calc_ds_index_from_commit(struct pnfs_layout_segment *lseg, u32 i)
  {
         struct nfs4_filelayout_segment *flseg = FILELAYOUT_LSEG(lseg);
@@ -797,11 +891,12 @@ static int filelayout_initiate_commit(struct nfs_write_data *data, int how)
         idx = calc_ds_index_from_commit(lseg, data->ds_commit_index);
         ds = nfs4_fl_prepare_ds(lseg, idx);
         if (!ds) {
-               printk(KERN_ERR "%s: prepare_ds failed, use MDS\n", __func__);
+               printk(KERN_ERR "NFS: %s: prepare_ds failed, use MDS\n",
+                       __func__);
                 set_bit(lo_fail_bit(IOMODE_RW), &lseg->pls_layout->plh_flags);
                 set_bit(lo_fail_bit(IOMODE_READ), &lseg->pls_layout->plh_flags);
                 prepare_to_resend_writes(data);
-               data->mds_ops->rpc_release(data);
+               filelayout_commit_release(data);
                 return -EAGAIN;
         }
         dprintk("%s ino %lu, how %d\n", __func__, data->inode->i_ino, how);
@@ -817,24 +912,87 @@ static int filelayout_initiate_commit(struct nfs_write_data *data, int how)
  /*
   * This is only useful while we are using whole file layouts.
   */
-static struct pnfs_layout_segment *find_only_write_lseg(struct inode *inode)
+static struct pnfs_layout_segment *
+find_only_write_lseg_locked(struct inode *inode)
  {
-       struct pnfs_layout_segment *lseg, *rv = NULL;
+       struct pnfs_layout_segment *lseg;
  
-       spin_lock(&inode->i_lock);
         list_for_each_entry(lseg, &NFS_I(inode)->layout->plh_segs, pls_list)
                 if (lseg->pls_range.iomode == IOMODE_RW)
-                       rv = get_lseg(lseg);
+                       return lseg;
+       return NULL;
+}
+
+static struct pnfs_layout_segment *find_only_write_lseg(struct inode *inode)
+{
+       struct pnfs_layout_segment *rv;
+
+       spin_lock(&inode->i_lock);
+       rv = find_only_write_lseg_locked(inode);
+       if (rv)
+               get_lseg(rv);
         spin_unlock(&inode->i_lock);
         return rv;
  }
  
-static int alloc_ds_commits(struct inode *inode, struct list_head *list)
+static int
+filelayout_scan_ds_commit_list(struct nfs4_fl_commit_bucket *bucket, int max,
+               spinlock_t *lock)
+{
+       struct list_head *src = &bucket->written;
+       struct list_head *dst = &bucket->committing;
+       struct nfs_page *req, *tmp;
+       int ret = 0;
+
+       list_for_each_entry_safe(req, tmp, src, wb_list) {
+               if (!nfs_lock_request(req))
+                       continue;
+               if (cond_resched_lock(lock))
+                       list_safe_reset_next(req, tmp, wb_list);
+               nfs_request_remove_commit_list(req);
+               clear_bit(PG_COMMIT_TO_DS, &req->wb_flags);
+               nfs_list_add_request(req, dst);
+               ret++;
+               if (ret == max)
+                       break;
+       }
+       return ret;
+}
+
+/* Move reqs from written to committing lists, returning count of number moved.
+ * Note called with i_lock held.
+ */
+static int filelayout_scan_commit_lists(struct inode *inode, int max,
+               spinlock_t *lock)
+{
+       struct pnfs_layout_segment *lseg;
+       struct nfs4_filelayout_segment *fl;
+       int i, rv = 0, cnt;
+
+       lseg = find_only_write_lseg_locked(inode);
+       if (!lseg)
+               goto out_done;
+       fl = FILELAYOUT_LSEG(lseg);
+       if (fl->commit_through_mds)
+               goto out_done;
+       for (i = 0; i < fl->number_of_buckets && max != 0; i++) {
+               cnt = filelayout_scan_ds_commit_list(&fl->commit_buckets[i],
+                               max, lock);
+               max -= cnt;
+               rv += cnt;
+       }
+out_done:
+       return rv;
+}
+
+static unsigned int
+alloc_ds_commits(struct inode *inode, struct list_head *list)
  {
         struct pnfs_layout_segment *lseg;
         struct nfs4_filelayout_segment *fl;
         struct nfs_write_data *data;
         int i, j;
+       unsigned int nreq = 0;
  
         /* Won't need this when non-whole file layout segments are supported
          * instead we will use a pnfs_layout_hdr structure */
@@ -843,28 +1001,27 @@ static int alloc_ds_commits(struct inode *inode, struct list_head *list)
                 return 0;
         fl = FILELAYOUT_LSEG(lseg);
         for (i = 0; i < fl->number_of_buckets; i++) {
-               if (list_empty(&fl->commit_buckets[i]))
+               if (list_empty(&fl->commit_buckets[i].committing))
                         continue;
                 data = nfs_commitdata_alloc();
                 if (!data)
-                       goto out_bad;
+                       break;
                 data->ds_commit_index = i;
                 data->lseg = lseg;
                 list_add(&data->pages, list);
+               nreq++;
         }
-       put_lseg(lseg);
-       return 0;
  
-out_bad:
+       /* Clean up on error */
         for (j = i; j < fl->number_of_buckets; j++) {
-               if (list_empty(&fl->commit_buckets[i]))
+               if (list_empty(&fl->commit_buckets[i].committing))
                         continue;
-               nfs_retry_commit(&fl->commit_buckets[i], lseg);
+               nfs_retry_commit(&fl->commit_buckets[i].committing, lseg);
                 put_lseg(lseg);  /* associated with emptying bucket */
         }
         put_lseg(lseg);
         /* Caller will clean up entries put on list */
-       return -ENOMEM;
+       return nreq;
  }
  
  /* This follows nfs_commit_list pretty closely */
@@ -874,40 +1031,40 @@ filelayout_commit_pagelist(struct inode *inode, struct list_head *mds_pages,
  {
         struct nfs_write_data   *data, *tmp;
         LIST_HEAD(list);
+       unsigned int nreq = 0;
  
         if (!list_empty(mds_pages)) {
                 data = nfs_commitdata_alloc();
-               if (!data)
-                       goto out_bad;
-               data->lseg = NULL;
-               list_add(&data->pages, &list);
+               if (data != NULL) {
+                       data->lseg = NULL;
+                       list_add(&data->pages, &list);
+                       nreq++;
+               } else
+                       nfs_retry_commit(mds_pages, NULL);
         }
  
-       if (alloc_ds_commits(inode, &list))
-               goto out_bad;
+       nreq += alloc_ds_commits(inode, &list);
+
+       if (nreq == 0) {
+               nfs_commit_clear_lock(NFS_I(inode));
+               goto out;
+       }
+
+       atomic_add(nreq, &NFS_I(inode)->commits_outstanding);
  
         list_for_each_entry_safe(data, tmp, &list, pages) {
                 list_del_init(&data->pages);
-               atomic_inc(&NFS_I(inode)->commits_outstanding);
                 if (!data->lseg) {
                         nfs_init_commit(data, mds_pages, NULL);
                         nfs_initiate_commit(data, NFS_CLIENT(inode),
                                             data->mds_ops, how);
                 } else {
-                       nfs_init_commit(data, &FILELAYOUT_LSEG(data->lseg)->commit_buckets[data->ds_commit_index], data->lseg);
+                       nfs_init_commit(data, &FILELAYOUT_LSEG(data->lseg)->commit_buckets[data->ds_commit_index].committing, data->lseg);
                         filelayout_initiate_commit(data, how);
                 }
         }
-       return 0;
- out_bad:
-       list_for_each_entry_safe(data, tmp, &list, pages) {
-               nfs_retry_commit(&data->pages, data->lseg);
-               list_del_init(&data->pages);
-               nfs_commit_free(data);
-       }
-       nfs_retry_commit(mds_pages, NULL);
-       nfs_commit_clear_lock(NFS_I(inode));
-       return -ENOMEM;
+out:
+       return PNFS_ATTEMPTED;
  }
  
  static void
@@ -924,8 +1081,9 @@ static struct pnfs_layoutdriver_type filelayout_type = {
         .free_lseg              = filelayout_free_lseg,
         .pg_read_ops            = &filelayout_pg_read_ops,
         .pg_write_ops           = &filelayout_pg_write_ops,
-       .mark_pnfs_commit       = filelayout_mark_pnfs_commit,
-       .choose_commit_list     = filelayout_choose_commit_list,
+       .mark_request_commit    = filelayout_mark_request_commit,
+       .clear_request_commit   = filelayout_clear_request_commit,
+       .scan_commit_lists      = filelayout_scan_commit_lists,
         .commit_pagelist        = filelayout_commit_pagelist,
         .read_pagelist          = filelayout_read_pagelist,
         .write_pagelist         = filelayout_write_pagelist,
diff --git a/fs/nfs/nfs4filelayout.h b/fs/nfs/nfs4filelayout.h
index 2e42284253fa600ba9266afc6111711653558949..21190bb1f5e348c5549e5985afb8cdf896aa72dd 100644
--- a/fs/nfs/nfs4filelayout.h
+++ b/fs/nfs/nfs4filelayout.h
@@ -74,6 +74,11 @@ struct nfs4_file_layout_dsaddr {
         struct nfs4_pnfs_ds             *ds_list[1];
  };
  
+struct nfs4_fl_commit_bucket {
+       struct list_head written;
+       struct list_head committing;
+};
+
  struct nfs4_filelayout_segment {
         struct pnfs_layout_segment generic_hdr;
         u32 stripe_type;
@@ -84,7 +89,7 @@ struct nfs4_filelayout_segment {
         struct nfs4_file_layout_dsaddr *dsaddr; /* Point to GETDEVINFO data */
         unsigned int num_fh;
         struct nfs_fh **fh_array;
-       struct list_head *commit_buckets; /* Sort commits to ds */
+       struct nfs4_fl_commit_bucket *commit_buckets; /* Sort commits to ds */
         int number_of_buckets;
  };
  
diff --git a/fs/nfs/nfs4filelayoutdev.c b/fs/nfs/nfs4filelayoutdev.c
index 8ae91908f5aa6fa38c128348beb272ddb6226a3b..a866bbd2890a056b530ebe3ae91cd74b0862f4a1 100644
--- a/fs/nfs/nfs4filelayoutdev.c
+++ b/fs/nfs/nfs4filelayoutdev.c
@@ -45,7 +45,7 @@
   *   - incremented when a device id maps a data server already in the cache.
   *   - decremented when deviceid is removed from the cache.
   */
-DEFINE_SPINLOCK(nfs4_ds_cache_lock);
+static DEFINE_SPINLOCK(nfs4_ds_cache_lock);
  static LIST_HEAD(nfs4_data_server_cache);
  
  /* Debug routines */
@@ -108,58 +108,40 @@ same_sockaddr(struct sockaddr *addr1, struct sockaddr *addr2)
         return false;
  }
  
-/*
- * Lookup DS by addresses.  The first matching address returns true.
- * nfs4_ds_cache_lock is held
- */
-static struct nfs4_pnfs_ds *
-_data_server_lookup_locked(struct list_head *dsaddrs)
+static bool
+_same_data_server_addrs_locked(const struct list_head *dsaddrs1,
+                              const struct list_head *dsaddrs2)
  {
-       struct nfs4_pnfs_ds *ds;
         struct nfs4_pnfs_ds_addr *da1, *da2;
  
-       list_for_each_entry(da1, dsaddrs, da_node) {
-               list_for_each_entry(ds, &nfs4_data_server_cache, ds_node) {
-                       list_for_each_entry(da2, &ds->ds_addrs, da_node) {
-                               if (same_sockaddr(
-                                       (struct sockaddr *)&da1->da_addr,
-                                       (struct sockaddr *)&da2->da_addr))
-                                       return ds;
-                       }
-               }
+       /* step through both lists, comparing as we go */
+       for (da1 = list_first_entry(dsaddrs1, typeof(*da1), da_node),
+            da2 = list_first_entry(dsaddrs2, typeof(*da2), da_node);
+            da1 != NULL && da2 != NULL;
+            da1 = list_entry(da1->da_node.next, typeof(*da1), da_node),
+            da2 = list_entry(da2->da_node.next, typeof(*da2), da_node)) {
+               if (!same_sockaddr((struct sockaddr *)&da1->da_addr,
+                                  (struct sockaddr *)&da2->da_addr))
+                       return false;
         }
-       return NULL;
+       if (da1 == NULL && da2 == NULL)
+               return true;
+
+       return false;
  }
  
  /*
- * Compare two lists of addresses.
+ * Lookup DS by addresses.  nfs4_ds_cache_lock is held
   */
-static bool
-_data_server_match_all_addrs_locked(struct list_head *dsaddrs1,
-                                   struct list_head *dsaddrs2)
+static struct nfs4_pnfs_ds *
+_data_server_lookup_locked(const struct list_head *dsaddrs)
  {
-       struct nfs4_pnfs_ds_addr *da1, *da2;
-       size_t count1 = 0,
-              count2 = 0;
-
-       list_for_each_entry(da1, dsaddrs1, da_node)
-               count1++;
-
-       list_for_each_entry(da2, dsaddrs2, da_node) {
-               bool found = false;
-               count2++;
-               list_for_each_entry(da1, dsaddrs1, da_node) {
-                       if (same_sockaddr((struct sockaddr *)&da1->da_addr,
-                               (struct sockaddr *)&da2->da_addr)) {
-                               found = true;
-                               break;
-                       }
-               }
-               if (!found)
-                       return false;
-       }
+       struct nfs4_pnfs_ds *ds;
  
-       return (count1 == count2);
+       list_for_each_entry(ds, &nfs4_data_server_cache, ds_node)
+               if (_same_data_server_addrs_locked(&ds->ds_addrs, dsaddrs))
+                       return ds;
+       return NULL;
  }
  
  /*
@@ -356,11 +338,6 @@ nfs4_pnfs_ds_add(struct list_head *dsaddrs, gfp_t gfp_flags)
                 dprintk("%s add new data server %s\n", __func__,
                         ds->ds_remotestr);
         } else {
-               if (!_data_server_match_all_addrs_locked(&tmp_ds->ds_addrs,
-                                                        dsaddrs)) {
-                       dprintk("%s:  multipath address mismatch: %s != %s",
-                               __func__, tmp_ds->ds_remotestr, remotestr);
-               }
                 kfree(remotestr);
                 kfree(ds);
                 atomic_inc(&tmp_ds->ds_count);
@@ -378,7 +355,7 @@ out:
   * Currently only supports ipv4, ipv6 and one multi-path address.
   */
  static struct nfs4_pnfs_ds_addr *
-decode_ds_addr(struct xdr_stream *streamp, gfp_t gfp_flags)
+decode_ds_addr(struct net *net, struct xdr_stream *streamp, gfp_t gfp_flags)
  {
         struct nfs4_pnfs_ds_addr *da = NULL;
         char *buf, *portstr;
@@ -457,7 +434,7 @@ decode_ds_addr(struct xdr_stream *streamp, gfp_t gfp_flags)
  
         INIT_LIST_HEAD(&da->da_node);
  
-       if (!rpc_pton(buf, portstr-buf, (struct sockaddr *)&da->da_addr,
+       if (!rpc_pton(net, buf, portstr-buf, (struct sockaddr *)&da->da_addr,
                       sizeof(da->da_addr))) {
                 dprintk("%s: error parsing address %s\n", __func__, buf);
                 goto out_free_da;
@@ -554,7 +531,7 @@ decode_device(struct inode *ino, struct pnfs_device *pdev, gfp_t gfp_flags)
         cnt = be32_to_cpup(p);
         dprintk("%s stripe count  %d\n", __func__, cnt);
         if (cnt > NFS4_PNFS_MAX_STRIPE_CNT) {
-               printk(KERN_WARNING "%s: stripe count %d greater than "
+               printk(KERN_WARNING "NFS: %s: stripe count %d greater than "
                        "supported maximum %d\n", __func__,
                         cnt, NFS4_PNFS_MAX_STRIPE_CNT);
                 goto out_err_free_scratch;
@@ -585,7 +562,7 @@ decode_device(struct inode *ino, struct pnfs_device *pdev, gfp_t gfp_flags)
         num = be32_to_cpup(p);
         dprintk("%s ds_num %u\n", __func__, num);
         if (num > NFS4_PNFS_MAX_MULTI_CNT) {
-               printk(KERN_WARNING "%s: multipath count %d greater than "
+               printk(KERN_WARNING "NFS: %s: multipath count %d greater than "
                         "supported maximum %d\n", __func__,
                         num, NFS4_PNFS_MAX_MULTI_CNT);
                 goto out_err_free_stripe_indices;
@@ -593,7 +570,7 @@ decode_device(struct inode *ino, struct pnfs_device *pdev, gfp_t gfp_flags)
  
         /* validate stripe indices are all < num */
         if (max_stripe_index >= num) {
-               printk(KERN_WARNING "%s: stripe index %u >= num ds %u\n",
+               printk(KERN_WARNING "NFS: %s: stripe index %u >= num ds %u\n",
                         __func__, max_stripe_index, num);
                 goto out_err_free_stripe_indices;
         }
@@ -625,7 +602,8 @@ decode_device(struct inode *ino, struct pnfs_device *pdev, gfp_t gfp_flags)
  
                 mp_count = be32_to_cpup(p); /* multipath count */
                 for (j = 0; j < mp_count; j++) {
-                       da = decode_ds_addr(&stream, gfp_flags);
+                       da = decode_ds_addr(NFS_SERVER(ino)->nfs_client->net,
+                                           &stream, gfp_flags);
                         if (da)
                                 list_add_tail(&da->da_node, &dsaddrs);
                 }
@@ -686,7 +664,7 @@ decode_and_add_device(struct inode *inode, struct pnfs_device *dev, gfp_t gfp_fl
  
         new = decode_device(inode, dev, gfp_flags);
         if (!new) {
-               printk(KERN_WARNING "%s: Could not decode or add device\n",
+               printk(KERN_WARNING "NFS: %s: Could not decode or add device\n",
                         __func__);
                 return NULL;
         }
@@ -835,7 +813,7 @@ nfs4_fl_prepare_ds(struct pnfs_layout_segment *lseg, u32 ds_idx)
         struct nfs4_pnfs_ds *ds = dsaddr->ds_list[ds_idx];
  
         if (ds == NULL) {
-               printk(KERN_ERR "%s: No data server for offset index %d\n",
+               printk(KERN_ERR "NFS: %s: No data server for offset index %d\n",
                         __func__, ds_idx);
                 return NULL;
         }
diff --git a/fs/nfs/nfs4namespace.c b/fs/nfs/nfs4namespace.c
index bb80c49b6533b44aab58171dcd69397ff02f39a2..9c8eca315f431199aa481c0eabc6ae3e044f8267 100644
--- a/fs/nfs/nfs4namespace.c
+++ b/fs/nfs/nfs4namespace.c
@@ -94,13 +94,14 @@ static int nfs4_validate_fspath(struct dentry *dentry,
  }
  
  static size_t nfs_parse_server_name(char *string, size_t len,
-               struct sockaddr *sa, size_t salen)
+               struct sockaddr *sa, size_t salen, struct nfs_server *server)
  {
+       struct net *net = rpc_net_ns(server->client);
         ssize_t ret;
  
-       ret = rpc_pton(string, len, sa, salen);
+       ret = rpc_pton(net, string, len, sa, salen);
         if (ret == 0) {
-               ret = nfs_dns_resolve_name(string, len, sa, salen);
+               ret = nfs_dns_resolve_name(net, string, len, sa, salen);
                 if (ret < 0)
                         ret = 0;
         }
@@ -137,7 +138,8 @@ static struct vfsmount *try_location(struct nfs_clone_mount *mountdata,
                         continue;
  
                 mountdata->addrlen = nfs_parse_server_name(buf->data, buf->len,
-                               mountdata->addr, addr_bufsize);
+                               mountdata->addr, addr_bufsize,
+                               NFS_SB(mountdata->sb));
                 if (mountdata->addrlen == 0)
                         continue;
  
diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
index caf92d05c3a901adabe5117a895107952e5cff23..e809d2305ebf3a6431c52a6f0a805672365557fb 100644
--- a/fs/nfs/nfs4proc.c
+++ b/fs/nfs/nfs4proc.c
@@ -72,18 +72,21 @@
  
  #define NFS4_MAX_LOOP_ON_RECOVER (10)
  
+static unsigned short max_session_slots = NFS4_DEF_SLOT_TABLE_SIZE;
+
  struct nfs4_opendata;
  static int _nfs4_proc_open(struct nfs4_opendata *data);
  static int _nfs4_recover_proc_open(struct nfs4_opendata *data);
  static int nfs4_do_fsinfo(struct nfs_server *, struct nfs_fh *, struct nfs_fsinfo *);
  static int nfs4_async_handle_error(struct rpc_task *, const struct nfs_server *, struct nfs4_state *);
+static void nfs_fixup_referral_attributes(struct nfs_fattr *fattr);
  static int _nfs4_proc_getattr(struct nfs_server *server, struct nfs_fh *fhandle, struct nfs_fattr *fattr);
  static int nfs4_do_setattr(struct inode *inode, struct rpc_cred *cred,
                             struct nfs_fattr *fattr, struct iattr *sattr,
                             struct nfs4_state *state);
  #ifdef CONFIG_NFS_V4_1
-static int nfs41_test_stateid(struct nfs_server *, struct nfs4_state *);
-static int nfs41_free_stateid(struct nfs_server *, struct nfs4_state *);
+static int nfs41_test_stateid(struct nfs_server *, nfs4_stateid *);
+static int nfs41_free_stateid(struct nfs_server *, nfs4_stateid *);
  #endif
  /* Prevent leaks of NFSv4 errors into userland */
  static int nfs4_map_errors(int err)
@@ -259,15 +262,28 @@ static int nfs4_handle_exception(struct nfs_server *server, int errorcode, struc
  {
         struct nfs_client *clp = server->nfs_client;
         struct nfs4_state *state = exception->state;
+       struct inode *inode = exception->inode;
         int ret = errorcode;
  
         exception->retry = 0;
         switch(errorcode) {
                 case 0:
                         return 0;
+               case -NFS4ERR_OPENMODE:
+                       if (nfs_have_delegation(inode, FMODE_READ)) {
+                               nfs_inode_return_delegation(inode);
+                               exception->retry = 1;
+                               return 0;
+                       }
+                       if (state == NULL)
+                               break;
+                       nfs4_schedule_stateid_recovery(server, state);
+                       goto wait_on_recovery;
+               case -NFS4ERR_DELEG_REVOKED:
                 case -NFS4ERR_ADMIN_REVOKED:
                 case -NFS4ERR_BAD_STATEID:
-               case -NFS4ERR_OPENMODE:
+                       if (state != NULL)
+                               nfs_remove_bad_delegation(state->inode);
                         if (state == NULL)
                                 break;
                         nfs4_schedule_stateid_recovery(server, state);
@@ -360,16 +376,14 @@ static void renew_lease(const struct nfs_server *server, unsigned long timestamp
   * When updating highest_used_slotid there may be "holes" in the bitmap
   * so we need to scan down from highest_used_slotid to 0 looking for the now
   * highest slotid in use.
- * If none found, highest_used_slotid is set to -1.
+ * If none found, highest_used_slotid is set to NFS4_NO_SLOT.
   *
   * Must be called while holding tbl->slot_tbl_lock
   */
  static void
-nfs4_free_slot(struct nfs4_slot_table *tbl, u8 free_slotid)
+nfs4_free_slot(struct nfs4_slot_table *tbl, u32 slotid)
  {
-       int slotid = free_slotid;
-
-       BUG_ON(slotid < 0 || slotid >= NFS4_MAX_SLOT_TABLE);
+       BUG_ON(slotid >= NFS4_MAX_SLOT_TABLE);
         /* clear used bit in bitmap */
         __clear_bit(slotid, tbl->used_slots);
  
@@ -379,10 +393,16 @@ nfs4_free_slot(struct nfs4_slot_table *tbl, u8 free_slotid)
                 if (slotid < tbl->max_slots)
                         tbl->highest_used_slotid = slotid;
                 else
-                       tbl->highest_used_slotid = -1;
+                       tbl->highest_used_slotid = NFS4_NO_SLOT;
         }
-       dprintk("%s: free_slotid %u highest_used_slotid %d\n", __func__,
-               free_slotid, tbl->highest_used_slotid);
+       dprintk("%s: slotid %u highest_used_slotid %d\n", __func__,
+               slotid, tbl->highest_used_slotid);
+}
+
+bool nfs4_set_task_privileged(struct rpc_task *task, void *dummy)
+{
+       rpc_task_set_priority(task, RPC_PRIORITY_PRIVILEGED);
+       return true;
  }
  
  /*
@@ -390,16 +410,13 @@ nfs4_free_slot(struct nfs4_slot_table *tbl, u8 free_slotid)
   */
  static void nfs4_check_drain_fc_complete(struct nfs4_session *ses)
  {
-       struct rpc_task *task;
-
         if (!test_bit(NFS4_SESSION_DRAINING, &ses->session_state)) {
-               task = rpc_wake_up_next(&ses->fc_slot_table.slot_tbl_waitq);
-               if (task)
-                       rpc_task_set_priority(task, RPC_PRIORITY_PRIVILEGED);
+               rpc_wake_up_first(&ses->fc_slot_table.slot_tbl_waitq,
+                               nfs4_set_task_privileged, NULL);
                 return;
         }
  
-       if (ses->fc_slot_table.highest_used_slotid != -1)
+       if (ses->fc_slot_table.highest_used_slotid != NFS4_NO_SLOT)
                 return;
  
         dprintk("%s COMPLETE: Session Fore Channel Drained\n", __func__);
@@ -412,7 +429,7 @@ static void nfs4_check_drain_fc_complete(struct nfs4_session *ses)
  void nfs4_check_drain_bc_complete(struct nfs4_session *ses)
  {
         if (!test_bit(NFS4_SESSION_DRAINING, &ses->session_state) ||
-           ses->bc_slot_table.highest_used_slotid != -1)
+           ses->bc_slot_table.highest_used_slotid != NFS4_NO_SLOT)
                 return;
         dprintk("%s COMPLETE: Session Back Channel Drained\n", __func__);
         complete(&ses->bc_slot_table.complete);
@@ -507,25 +524,25 @@ static int nfs4_sequence_done(struct rpc_task *task,
   * nfs4_find_slot looks for an unset bit in the used_slots bitmap.
   * If found, we mark the slot as used, update the highest_used_slotid,
   * and respectively set up the sequence operation args.
- * The slot number is returned if found, or NFS4_MAX_SLOT_TABLE otherwise.
+ * The slot number is returned if found, or NFS4_NO_SLOT otherwise.
   *
   * Note: must be called with under the slot_tbl_lock.
   */
-static u8
+static u32
  nfs4_find_slot(struct nfs4_slot_table *tbl)
  {
-       int slotid;
-       u8 ret_id = NFS4_MAX_SLOT_TABLE;
-       BUILD_BUG_ON((u8)NFS4_MAX_SLOT_TABLE != (int)NFS4_MAX_SLOT_TABLE);
+       u32 slotid;
+       u32 ret_id = NFS4_NO_SLOT;
  
-       dprintk("--> %s used_slots=%04lx highest_used=%d max_slots=%d\n",
+       dprintk("--> %s used_slots=%04lx highest_used=%u max_slots=%u\n",
                 __func__, tbl->used_slots[0], tbl->highest_used_slotid,
                 tbl->max_slots);
         slotid = find_first_zero_bit(tbl->used_slots, tbl->max_slots);
         if (slotid >= tbl->max_slots)
                 goto out;
         __set_bit(slotid, tbl->used_slots);
-       if (slotid > tbl->highest_used_slotid)
+       if (slotid > tbl->highest_used_slotid ||
+                       tbl->highest_used_slotid == NFS4_NO_SLOT)
                 tbl->highest_used_slotid = slotid;
         ret_id = slotid;
  out:
@@ -534,15 +551,25 @@ out:
         return ret_id;
  }
  
+static void nfs41_init_sequence(struct nfs4_sequence_args *args,
+               struct nfs4_sequence_res *res, int cache_reply)
+{
+       args->sa_session = NULL;
+       args->sa_cache_this = 0;
+       if (cache_reply)
+               args->sa_cache_this = 1;
+       res->sr_session = NULL;
+       res->sr_slot = NULL;
+}
+
  int nfs41_setup_sequence(struct nfs4_session *session,
                                 struct nfs4_sequence_args *args,
                                 struct nfs4_sequence_res *res,
-                               int cache_reply,
                                 struct rpc_task *task)
  {
         struct nfs4_slot *slot;
         struct nfs4_slot_table *tbl;
-       u8 slotid;
+       u32 slotid;
  
         dprintk("--> %s\n", __func__);
         /* slot already allocated? */
@@ -570,7 +597,7 @@ int nfs41_setup_sequence(struct nfs4_session *session,
         }
  
         slotid = nfs4_find_slot(tbl);
-       if (slotid == NFS4_MAX_SLOT_TABLE) {
+       if (slotid == NFS4_NO_SLOT) {
                 rpc_sleep_on(&tbl->slot_tbl_waitq, task, NULL);
                 spin_unlock(&tbl->slot_tbl_lock);
                 dprintk("<-- %s: no free slots\n", __func__);
@@ -582,7 +609,6 @@ int nfs41_setup_sequence(struct nfs4_session *session,
         slot = tbl->slots + slotid;
         args->sa_session = session;
         args->sa_slotid = slotid;
-       args->sa_cache_this = cache_reply;
  
         dprintk("<-- %s slotid=%d seqid=%d\n", __func__, slotid, slot->seq_nr);
  
@@ -602,24 +628,19 @@ EXPORT_SYMBOL_GPL(nfs41_setup_sequence);
  int nfs4_setup_sequence(const struct nfs_server *server,
                         struct nfs4_sequence_args *args,
                         struct nfs4_sequence_res *res,
-                       int cache_reply,
                         struct rpc_task *task)
  {
         struct nfs4_session *session = nfs4_get_session(server);
         int ret = 0;
  
-       if (session == NULL) {
-               args->sa_session = NULL;
-               res->sr_session = NULL;
+       if (session == NULL)
                 goto out;
-       }
  
         dprintk("--> %s clp %p session %p sr_slot %td\n",
                 __func__, session->clp, session, res->sr_slot ?
                         res->sr_slot - session->fc_slot_table.slots : -1);
  
-       ret = nfs41_setup_sequence(session, args, res, cache_reply,
-                                  task);
+       ret = nfs41_setup_sequence(session, args, res, task);
  out:
         dprintk("<-- %s status=%d\n", __func__, ret);
         return ret;
@@ -629,7 +650,6 @@ struct nfs41_call_sync_data {
         const struct nfs_server *seq_server;
         struct nfs4_sequence_args *seq_args;
         struct nfs4_sequence_res *seq_res;
-       int cache_reply;
  };
  
  static void nfs41_call_sync_prepare(struct rpc_task *task, void *calldata)
@@ -639,7 +659,7 @@ static void nfs41_call_sync_prepare(struct rpc_task *task, void *calldata)
         dprintk("--> %s data->seq_server %p\n", __func__, data->seq_server);
  
         if (nfs4_setup_sequence(data->seq_server, data->seq_args,
-                               data->seq_res, data->cache_reply, task))
+                               data->seq_res, task))
                 return;
         rpc_call_start(task);
  }
@@ -657,12 +677,12 @@ static void nfs41_call_sync_done(struct rpc_task *task, void *calldata)
         nfs41_sequence_done(task, data->seq_res);
  }
  
-struct rpc_call_ops nfs41_call_sync_ops = {
+static const struct rpc_call_ops nfs41_call_sync_ops = {
         .rpc_call_prepare = nfs41_call_sync_prepare,
         .rpc_call_done = nfs41_call_sync_done,
  };
  
-struct rpc_call_ops nfs41_call_priv_sync_ops = {
+static const struct rpc_call_ops nfs41_call_priv_sync_ops = {
         .rpc_call_prepare = nfs41_call_priv_sync_prepare,
         .rpc_call_done = nfs41_call_sync_done,
  };
@@ -672,7 +692,6 @@ static int nfs4_call_sync_sequence(struct rpc_clnt *clnt,
                                    struct rpc_message *msg,
                                    struct nfs4_sequence_args *args,
                                    struct nfs4_sequence_res *res,
-                                  int cache_reply,
                                    int privileged)
  {
         int ret;
@@ -681,7 +700,6 @@ static int nfs4_call_sync_sequence(struct rpc_clnt *clnt,
                 .seq_server = server,
                 .seq_args = args,
                 .seq_res = res,
-               .cache_reply = cache_reply,
         };
         struct rpc_task_setup task_setup = {
                 .rpc_client = clnt,
@@ -690,7 +708,6 @@ static int nfs4_call_sync_sequence(struct rpc_clnt *clnt,
                 .callback_data = &data
         };
  
-       res->sr_slot = NULL;
         if (privileged)
                 task_setup.callback_ops = &nfs41_call_priv_sync_ops;
         task = rpc_run_task(&task_setup);
@@ -710,10 +727,17 @@ int _nfs4_call_sync_session(struct rpc_clnt *clnt,
                             struct nfs4_sequence_res *res,
                             int cache_reply)
  {
-       return nfs4_call_sync_sequence(clnt, server, msg, args, res, cache_reply, 0);
+       nfs41_init_sequence(args, res, cache_reply);
+       return nfs4_call_sync_sequence(clnt, server, msg, args, res, 0);
  }
  
  #else
+static inline
+void nfs41_init_sequence(struct nfs4_sequence_args *args,
+               struct nfs4_sequence_res *res, int cache_reply)
+{
+}
+
  static int nfs4_sequence_done(struct rpc_task *task,
                                struct nfs4_sequence_res *res)
  {
@@ -728,7 +752,7 @@ int _nfs4_call_sync(struct rpc_clnt *clnt,
                     struct nfs4_sequence_res *res,
                     int cache_reply)
  {
-       args->sa_session = res->sr_session = NULL;
+       nfs41_init_sequence(args, res, cache_reply);
         return rpc_call_sync(clnt, msg, 0);
  }
  
@@ -815,20 +839,22 @@ static struct nfs4_opendata *nfs4_opendata_alloc(struct dentry *dentry,
         p->o_arg.open_flags = flags;
         p->o_arg.fmode = fmode & (FMODE_READ|FMODE_WRITE);
         p->o_arg.clientid = server->nfs_client->cl_clientid;
-       p->o_arg.id = sp->so_owner_id.id;
+       p->o_arg.id = sp->so_seqid.owner_id;
         p->o_arg.name = &dentry->d_name;
         p->o_arg.server = server;
         p->o_arg.bitmask = server->attr_bitmask;
         p->o_arg.dir_bitmask = server->cache_consistency_bitmask;
         p->o_arg.claim = NFS4_OPEN_CLAIM_NULL;
-       if (flags & O_CREAT) {
-               u32 *s;
+       if (attrs != NULL && attrs->ia_valid != 0) {
+               __be32 verf[2];
  
                 p->o_arg.u.attrs = &p->attrs;
                 memcpy(&p->attrs, attrs, sizeof(p->attrs));
-               s = (u32 *) p->o_arg.u.verifier.data;
-               s[0] = jiffies;
-               s[1] = current->pid;
+
+               verf[0] = jiffies;
+               verf[1] = current->pid;
+               memcpy(p->o_arg.u.verifier.data, verf,
+                               sizeof(p->o_arg.u.verifier.data));
         }
         p->c_arg.fh = &p->o_res.fh;
         p->c_arg.stateid = &p->o_res.stateid;
@@ -878,7 +904,7 @@ static int can_open_cached(struct nfs4_state *state, fmode_t mode, int open_mode
  {
         int ret = 0;
  
-       if (open_mode & O_EXCL)
+       if (open_mode & (O_EXCL|O_TRUNC))
                 goto out;
         switch (mode & (FMODE_READ|FMODE_WRITE)) {
                 case FMODE_READ:
@@ -927,8 +953,8 @@ static void update_open_stateflags(struct nfs4_state *state, fmode_t fmode)
  static void nfs_set_open_stateid_locked(struct nfs4_state *state, nfs4_stateid *stateid, fmode_t fmode)
  {
         if (test_bit(NFS_DELEGATED_STATE, &state->flags) == 0)
-               memcpy(state->stateid.data, stateid->data, sizeof(state->stateid.data));
-       memcpy(state->open_stateid.data, stateid->data, sizeof(state->open_stateid.data));
+               nfs4_stateid_copy(&state->stateid, stateid);
+       nfs4_stateid_copy(&state->open_stateid, stateid);
         switch (fmode) {
                 case FMODE_READ:
                         set_bit(NFS_O_RDONLY_STATE, &state->flags);
@@ -956,7 +982,7 @@ static void __update_open_stateid(struct nfs4_state *state, nfs4_stateid *open_s
          */
         write_seqlock(&state->seqlock);
         if (deleg_stateid != NULL) {
-               memcpy(state->stateid.data, deleg_stateid->data, sizeof(state->stateid.data));
+               nfs4_stateid_copy(&state->stateid, deleg_stateid);
                 set_bit(NFS_DELEGATED_STATE, &state->flags);
         }
         if (open_stateid != NULL)
@@ -987,7 +1013,7 @@ static int update_open_stateid(struct nfs4_state *state, nfs4_stateid *open_stat
  
         if (delegation == NULL)
                 delegation = &deleg_cur->stateid;
-       else if (memcmp(deleg_cur->stateid.data, delegation->data, NFS4_STATEID_SIZE) != 0)
+       else if (!nfs4_stateid_match(&deleg_cur->stateid, delegation))
                 goto no_delegation_unlock;
  
         nfs_mark_delegation_referenced(deleg_cur);
@@ -1026,7 +1052,7 @@ static struct nfs4_state *nfs4_try_open_cached(struct nfs4_opendata *opendata)
         struct nfs4_state *state = opendata->state;
         struct nfs_inode *nfsi = NFS_I(state->inode);
         struct nfs_delegation *delegation;
-       int open_mode = opendata->o_arg.open_flags & O_EXCL;
+       int open_mode = opendata->o_arg.open_flags & (O_EXCL|O_TRUNC);
         fmode_t fmode = opendata->o_arg.fmode;
         nfs4_stateid stateid;
         int ret = -EAGAIN;
@@ -1048,7 +1074,7 @@ static struct nfs4_state *nfs4_try_open_cached(struct nfs4_opendata *opendata)
                         break;
                 }
                 /* Save the delegation */
-               memcpy(stateid.data, delegation->stateid.data, sizeof(stateid.data));
+               nfs4_stateid_copy(&stateid, &delegation->stateid);
                 rcu_read_unlock();
                 ret = nfs_may_open(state->inode, state->owner->so_cred, open_mode);
                 if (ret != 0)
@@ -1090,6 +1116,7 @@ static struct nfs4_state *nfs4_opendata_to_nfs4_state(struct nfs4_opendata *data
         if (state == NULL)
                 goto err_put_inode;
         if (data->o_res.delegation_type != 0) {
+               struct nfs_client *clp = NFS_SERVER(inode)->nfs_client;
                 int delegation_flags = 0;
  
                 rcu_read_lock();
@@ -1101,7 +1128,7 @@ static struct nfs4_state *nfs4_opendata_to_nfs4_state(struct nfs4_opendata *data
                         pr_err_ratelimited("NFS: Broken NFSv4 server %s is "
                                         "returning a delegation for "
                                         "OPEN(CLAIM_DELEGATE_CUR)\n",
-                                       NFS_CLIENT(inode)->cl_server);
+                                       clp->cl_hostname);
                 } else if ((delegation_flags & 1UL<<NFS_DELEGATION_NEED_RECLAIM) == 0)
                         nfs_inode_set_delegation(state->inode,
                                         data->owner->so_cred,
@@ -1210,10 +1237,10 @@ static int nfs4_open_recover(struct nfs4_opendata *opendata, struct nfs4_state *
          * Check if we need to update the current stateid.
          */
         if (test_bit(NFS_DELEGATED_STATE, &state->flags) == 0 &&
-           memcmp(state->stateid.data, state->open_stateid.data, sizeof(state->stateid.data)) != 0) {
+           !nfs4_stateid_match(&state->stateid, &state->open_stateid)) {
                 write_seqlock(&state->seqlock);
                 if (test_bit(NFS_DELEGATED_STATE, &state->flags) == 0)
-                       memcpy(state->stateid.data, state->open_stateid.data, sizeof(state->stateid.data));
+                       nfs4_stateid_copy(&state->stateid, &state->open_stateid);
                 write_sequnlock(&state->seqlock);
         }
         return 0;
@@ -1282,8 +1309,7 @@ static int _nfs4_open_delegation_recall(struct nfs_open_context *ctx, struct nfs
         if (IS_ERR(opendata))
                 return PTR_ERR(opendata);
         opendata->o_arg.claim = NFS4_OPEN_CLAIM_DELEGATE_CUR;
-       memcpy(opendata->o_arg.u.delegation.data, stateid->data,
-                       sizeof(opendata->o_arg.u.delegation.data));
+       nfs4_stateid_copy(&opendata->o_arg.u.delegation, stateid);
         ret = nfs4_open_recover(opendata, state);
         nfs4_opendata_put(opendata);
         return ret;
@@ -1319,8 +1345,11 @@ int nfs4_open_delegation_recall(struct nfs_open_context *ctx, struct nfs4_state
                                  * The show must go on: exit, but mark the
                                  * stateid as needing recovery.
                                  */
+                       case -NFS4ERR_DELEG_REVOKED:
                         case -NFS4ERR_ADMIN_REVOKED:
                         case -NFS4ERR_BAD_STATEID:
+                               nfs_inode_find_state_and_recover(state->inode,
+                                               stateid);
                                 nfs4_schedule_stateid_recovery(server, state);
                         case -EKEYEXPIRED:
                                 /*
@@ -1345,8 +1374,7 @@ static void nfs4_open_confirm_done(struct rpc_task *task, void *calldata)
  
         data->rpc_status = task->tk_status;
         if (data->rpc_status == 0) {
-               memcpy(data->o_res.stateid.data, data->c_res.stateid.data,
-                               sizeof(data->o_res.stateid.data));
+               nfs4_stateid_copy(&data->o_res.stateid, &data->c_res.stateid);
                 nfs_confirm_seqid(&data->owner->so_seqid, 0);
                 renew_lease(data->o_res.server, data->timestamp);
                 data->rpc_done = 1;
@@ -1440,7 +1468,7 @@ static void nfs4_open_prepare(struct rpc_task *task, void *calldata)
                 rcu_read_unlock();
         }
         /* Update sequence id. */
-       data->o_arg.id = sp->so_owner_id.id;
+       data->o_arg.id = sp->so_seqid.owner_id;
         data->o_arg.clientid = sp->so_server->nfs_client->cl_clientid;
         if (data->o_arg.claim == NFS4_OPEN_CLAIM_PREVIOUS) {
                 task->tk_msg.rpc_proc = &nfs4_procedures[NFSPROC4_CLNT_OPEN_NOATTR];
@@ -1449,7 +1477,7 @@ static void nfs4_open_prepare(struct rpc_task *task, void *calldata)
         data->timestamp = jiffies;
         if (nfs4_setup_sequence(data->o_arg.server,
                                 &data->o_arg.seq_args,
-                               &data->o_res.seq_res, 1, task))
+                               &data->o_res.seq_res, task))
                 return;
         rpc_call_start(task);
         return;
@@ -1551,6 +1579,7 @@ static int nfs4_run_open_task(struct nfs4_opendata *data, int isrecover)
         };
         int status;
  
+       nfs41_init_sequence(&o_arg->seq_args, &o_res->seq_res, 1);
         kref_get(&data->kref);
         data->rpc_done = 0;
         data->rpc_status = 0;
@@ -1712,15 +1741,32 @@ static int nfs4_open_expired(struct nfs4_state_owner *sp, struct nfs4_state *sta
  }
  
  #if defined(CONFIG_NFS_V4_1)
-static int nfs41_open_expired(struct nfs4_state_owner *sp, struct nfs4_state *state)
+static int nfs41_check_expired_stateid(struct nfs4_state *state, nfs4_stateid *stateid, unsigned int flags)
  {
-       int status;
+       int status = NFS_OK;
         struct nfs_server *server = NFS_SERVER(state->inode);
  
-       status = nfs41_test_stateid(server, state);
-       if (status == NFS_OK)
-               return 0;
-       nfs41_free_stateid(server, state);
+       if (state->flags & flags) {
+               status = nfs41_test_stateid(server, stateid);
+               if (status != NFS_OK) {
+                       nfs41_free_stateid(server, stateid);
+                       state->flags &= ~flags;
+               }
+       }
+       return status;
+}
+
+static int nfs41_open_expired(struct nfs4_state_owner *sp, struct nfs4_state *state)
+{
+       int deleg_status, open_status;
+       int deleg_flags = 1 << NFS_DELEGATED_STATE;
+       int open_flags = (1 << NFS_O_RDONLY_STATE) | (1 << NFS_O_WRONLY_STATE) | (1 << NFS_O_RDWR_STATE);
+
+       deleg_status = nfs41_check_expired_stateid(state, &state->stateid, deleg_flags);
+       open_status = nfs41_check_expired_stateid(state,  &state->open_stateid, open_flags);
+
+       if ((deleg_status == NFS_OK) && (open_status == NFS_OK))
+               return NFS_OK;
         return nfs4_open_expired(sp, state);
  }
  #endif
@@ -1754,7 +1800,8 @@ static int _nfs4_do_open(struct inode *dir, struct dentry *dentry, fmode_t fmode
  
         /* Protect against reboot recovery conflicts */
         status = -ENOMEM;
-       if (!(sp = nfs4_get_state_owner(server, cred))) {
+       sp = nfs4_get_state_owner(server, cred, GFP_KERNEL);
+       if (sp == NULL) {
                 dprintk("nfs4_do_open: nfs4_get_state_owner failed!\n");
                 goto out_err;
         }
@@ -1829,7 +1876,7 @@ static struct nfs4_state *nfs4_do_open(struct inode *dir, struct dentry *dentry,
                  * the user though...
                  */
                 if (status == -NFS4ERR_BAD_SEQID) {
-                       printk(KERN_WARNING "NFS: v4 server %s "
+                       pr_warn_ratelimited("NFS: v4 server %s "
                                         " returned a bad sequence-id error!\n",
                                         NFS_SERVER(dir)->nfs_client->cl_hostname);
                         exception.retry = 1;
@@ -1882,12 +1929,14 @@ static int _nfs4_do_setattr(struct inode *inode, struct rpc_cred *cred,
  
         nfs_fattr_init(fattr);
  
-       if (nfs4_copy_delegation_stateid(&arg.stateid, inode)) {
+       if (state != NULL) {
+               nfs4_select_rw_stateid(&arg.stateid, state, FMODE_WRITE,
+                               current->files, current->tgid);
+       } else if (nfs4_copy_delegation_stateid(&arg.stateid, inode,
+                               FMODE_WRITE)) {
                 /* Use that stateid */
-       } else if (state != NULL) {
-               nfs4_copy_stateid(&arg.stateid, state, current->files, current->tgid);
         } else
-               memcpy(&arg.stateid, &zero_stateid, sizeof(arg.stateid));
+               nfs4_stateid_copy(&arg.stateid, &zero_stateid);
  
         status = nfs4_call_sync(server->client, server, &msg, &arg.seq_args, &res.seq_res, 1);
         if (status == 0 && state != NULL)
@@ -1900,7 +1949,10 @@ static int nfs4_do_setattr(struct inode *inode, struct rpc_cred *cred,
                            struct nfs4_state *state)
  {
         struct nfs_server *server = NFS_SERVER(inode);
-       struct nfs4_exception exception = { };
+       struct nfs4_exception exception = {
+               .state = state,
+               .inode = inode,
+       };
         int err;
         do {
                 err = nfs4_handle_exception(server,
@@ -1954,6 +2006,7 @@ static void nfs4_close_done(struct rpc_task *task, void *data)
         struct nfs4_state *state = calldata->state;
         struct nfs_server *server = NFS_SERVER(calldata->inode);
  
+       dprintk("%s: begin!\n", __func__);
         if (!nfs4_sequence_done(task, &calldata->res.seq_res))
                 return;
          /* hmm. we are done with the inode, and in the process of freeing
@@ -1981,6 +2034,7 @@ static void nfs4_close_done(struct rpc_task *task, void *data)
         }
         nfs_release_seqid(calldata->arg.seqid);
         nfs_refresh_inode(calldata->inode, calldata->res.fattr);
+       dprintk("%s: done, ret = %d!\n", __func__, task->tk_status);
  }
  
  static void nfs4_close_prepare(struct rpc_task *task, void *data)
@@ -1989,6 +2043,7 @@ static void nfs4_close_prepare(struct rpc_task *task, void *data)
         struct nfs4_state *state = calldata->state;
         int call_close = 0;
  
+       dprintk("%s: begin!\n", __func__);
         if (nfs_wait_on_sequence(calldata->arg.seqid, task) != 0)
                 return;
  
@@ -2013,7 +2068,7 @@ static void nfs4_close_prepare(struct rpc_task *task, void *data)
         if (!call_close) {
                 /* Note: exit _without_ calling nfs4_close_done */
                 task->tk_action = NULL;
-               return;
+               goto out;
         }
  
         if (calldata->arg.fmode == 0) {
@@ -2022,17 +2077,20 @@ static void nfs4_close_prepare(struct rpc_task *task, void *data)
                     pnfs_roc_drain(calldata->inode, &calldata->roc_barrier)) {
                         rpc_sleep_on(&NFS_SERVER(calldata->inode)->roc_rpcwaitq,
                                      task, NULL);
-                       return;
+                       goto out;
                 }
         }
  
         nfs_fattr_init(calldata->res.fattr);
         calldata->timestamp = jiffies;
         if (nfs4_setup_sequence(NFS_SERVER(calldata->inode),
-                               &calldata->arg.seq_args, &calldata->res.seq_res,
-                               1, task))
-               return;
+                               &calldata->arg.seq_args,
+                               &calldata->res.seq_res,
+                               task))
+               goto out;
         rpc_call_start(task);
+out:
+       dprintk("%s: done!\n", __func__);
  }
  
  static const struct rpc_call_ops nfs4_close_ops = {
@@ -2074,6 +2132,7 @@ int nfs4_do_close(struct nfs4_state *state, gfp_t gfp_mask, int wait, bool roc)
         calldata = kzalloc(sizeof(*calldata), gfp_mask);
         if (calldata == NULL)
                 goto out;
+       nfs41_init_sequence(&calldata->arg.seq_args, &calldata->res.seq_res, 1);
         calldata->inode = state->inode;
         calldata->state = state;
         calldata->arg.fh = NFS_FH(state->inode);
@@ -2182,6 +2241,7 @@ static int _nfs4_server_capabilities(struct nfs_server *server, struct nfs_fh *f
                 server->cache_consistency_bitmask[0] &= FATTR4_WORD0_CHANGE|FATTR4_WORD0_SIZE;
                 server->cache_consistency_bitmask[1] &= FATTR4_WORD1_TIME_METADATA|FATTR4_WORD1_TIME_MODIFY;
                 server->acl_bitmask = res.acl_bitmask;
+               server->fh_expire_type = res.fh_expire_type;
         }
  
         return status;
@@ -2303,7 +2363,6 @@ static int nfs4_proc_get_root(struct nfs_server *server, struct nfs_fh *fhandle,
         return nfs4_map_errors(status);
  }
  
-static void nfs_fixup_referral_attributes(struct nfs_fattr *fattr);
  /*
   * Get locations and (maybe) other attributes of a referral.
   * Note that we'll actually follow the referral later when
@@ -2420,6 +2479,10 @@ nfs4_proc_setattr(struct dentry *dentry, struct nfs_fattr *fattr,
                 }
         }
  
+       /* Deal with open(O_TRUNC) */
+       if (sattr->ia_valid & ATTR_OPEN)
+               sattr->ia_valid &= ~(ATTR_MTIME|ATTR_CTIME|ATTR_OPEN);
+
         status = nfs4_do_setattr(inode, cred, fattr, sattr, state);
         if (status == 0)
                 nfs_setattr_update_inode(inode, sattr);
@@ -2494,7 +2557,7 @@ static int _nfs4_proc_access(struct inode *inode, struct nfs_access_entry *entry
         struct nfs_server *server = NFS_SERVER(inode);
         struct nfs4_accessargs args = {
                 .fh = NFS_FH(inode),
-               .bitmask = server->attr_bitmask,
+               .bitmask = server->cache_consistency_bitmask,
         };
         struct nfs4_accessres res = {
                 .server = server,
@@ -2712,8 +2775,18 @@ static void nfs4_proc_unlink_setup(struct rpc_message *msg, struct inode *dir)
  
         args->bitmask = server->cache_consistency_bitmask;
         res->server = server;
-       res->seq_res.sr_slot = NULL;
         msg->rpc_proc = &nfs4_procedures[NFSPROC4_CLNT_REMOVE];
+       nfs41_init_sequence(&args->seq_args, &res->seq_res, 1);
+}
+
+static void nfs4_proc_unlink_rpc_prepare(struct rpc_task *task, struct nfs_unlinkdata *data)
+{
+       if (nfs4_setup_sequence(NFS_SERVER(data->dir),
+                               &data->args.seq_args,
+                               &data->res.seq_res,
+                               task))
+               return;
+       rpc_call_start(task);
  }
  
  static int nfs4_proc_unlink_done(struct rpc_task *task, struct inode *dir)
@@ -2738,6 +2811,17 @@ static void nfs4_proc_rename_setup(struct rpc_message *msg, struct inode *dir)
         msg->rpc_proc = &nfs4_procedures[NFSPROC4_CLNT_RENAME];
         arg->bitmask = server->attr_bitmask;
         res->server = server;
+       nfs41_init_sequence(&arg->seq_args, &res->seq_res, 1);
+}
+
+static void nfs4_proc_rename_rpc_prepare(struct rpc_task *task, struct nfs_renamedata *data)
+{
+       if (nfs4_setup_sequence(NFS_SERVER(data->old_dir),
+                               &data->args.seq_args,
+                               &data->res.seq_res,
+                               task))
+               return;
+       rpc_call_start(task);
  }
  
  static int nfs4_proc_rename_done(struct rpc_task *task, struct inode *old_dir,
@@ -3232,6 +3316,17 @@ static void nfs4_proc_read_setup(struct nfs_read_data *data, struct rpc_message
         data->timestamp   = jiffies;
         data->read_done_cb = nfs4_read_done_cb;
         msg->rpc_proc = &nfs4_procedures[NFSPROC4_CLNT_READ];
+       nfs41_init_sequence(&data->args.seq_args, &data->res.seq_res, 0);
+}
+
+static void nfs4_proc_read_rpc_prepare(struct rpc_task *task, struct nfs_read_data *data)
+{
+       if (nfs4_setup_sequence(NFS_SERVER(data->inode),
+                               &data->args.seq_args,
+                               &data->res.seq_res,
+                               task))
+               return;
+       rpc_call_start(task);
  }
  
  /* Reset the the nfs_read_data to send the read to the MDS. */
@@ -3305,6 +3400,17 @@ static void nfs4_proc_write_setup(struct nfs_write_data *data, struct rpc_messag
         data->timestamp   = jiffies;
  
         msg->rpc_proc = &nfs4_procedures[NFSPROC4_CLNT_WRITE];
+       nfs41_init_sequence(&data->args.seq_args, &data->res.seq_res, 1);
+}
+
+static void nfs4_proc_write_rpc_prepare(struct rpc_task *task, struct nfs_write_data *data)
+{
+       if (nfs4_setup_sequence(NFS_SERVER(data->inode),
+                               &data->args.seq_args,
+                               &data->res.seq_res,
+                               task))
+               return;
+       rpc_call_start(task);
  }
  
  static int nfs4_commit_done_cb(struct rpc_task *task, struct nfs_write_data *data)
@@ -3339,6 +3445,7 @@ static void nfs4_proc_commit_setup(struct nfs_write_data *data, struct rpc_messa
                 data->write_done_cb = nfs4_commit_done_cb;
         data->res.server = server;
         msg->rpc_proc = &nfs4_procedures[NFSPROC4_CLNT_COMMIT];
+       nfs41_init_sequence(&data->args.seq_args, &data->res.seq_res, 1);
  }
  
  struct nfs4_renewdata {
@@ -3714,8 +3821,11 @@ nfs4_async_handle_error(struct rpc_task *task, const struct nfs_server *server,
         if (task->tk_status >= 0)
                 return 0;
         switch(task->tk_status) {
+               case -NFS4ERR_DELEG_REVOKED:
                 case -NFS4ERR_ADMIN_REVOKED:
                 case -NFS4ERR_BAD_STATEID:
+                       if (state != NULL)
+                               nfs_remove_bad_delegation(state->inode);
                 case -NFS4ERR_OPENMODE:
                         if (state == NULL)
                                 break;
@@ -3764,6 +3874,16 @@ wait_on_recovery:
         return -EAGAIN;
  }
  
+static void nfs4_construct_boot_verifier(struct nfs_client *clp,
+                                        nfs4_verifier *bootverf)
+{
+       __be32 verf[2];
+
+       verf[0] = htonl((u32)clp->cl_boot_time.tv_sec);
+       verf[1] = htonl((u32)clp->cl_boot_time.tv_nsec);
+       memcpy(bootverf->data, verf, sizeof(bootverf->data));
+}
+
  int nfs4_proc_setclientid(struct nfs_client *clp, u32 program,
                 unsigned short port, struct rpc_cred *cred,
                 struct nfs4_setclientid_res *res)
@@ -3780,15 +3900,13 @@ int nfs4_proc_setclientid(struct nfs_client *clp, u32 program,
                 .rpc_resp = res,
                 .rpc_cred = cred,
         };
-       __be32 *p;
         int loop = 0;
         int status;
  
-       p = (__be32*)sc_verifier.data;
-       *p++ = htonl((u32)clp->cl_boot_time.tv_sec);
-       *p = htonl((u32)clp->cl_boot_time.tv_nsec);
+       nfs4_construct_boot_verifier(clp, &sc_verifier);
  
         for(;;) {
+               rcu_read_lock();
                 setclientid.sc_name_len = scnprintf(setclientid.sc_name,
                                 sizeof(setclientid.sc_name), "%s/%s %s %s %u",
                                 clp->cl_ipaddr,
@@ -3805,6 +3923,7 @@ int nfs4_proc_setclientid(struct nfs_client *clp, u32 program,
                 setclientid.sc_uaddr_len = scnprintf(setclientid.sc_uaddr,
                                 sizeof(setclientid.sc_uaddr), "%s.%u.%u",
                                 clp->cl_ipaddr, port >> 8, port & 255);
+               rcu_read_unlock();
  
                 status = rpc_call_sync(clp->cl_rpcclient, &msg, RPC_TASK_TIMEOUT);
                 if (status != -NFS4ERR_CLID_INUSE)
@@ -3891,7 +4010,7 @@ static void nfs4_delegreturn_prepare(struct rpc_task *task, void *data)
  
         if (nfs4_setup_sequence(d_data->res.server,
                                 &d_data->args.seq_args,
-                               &d_data->res.seq_res, 1, task))
+                               &d_data->res.seq_res, task))
                 return;
         rpc_call_start(task);
  }
@@ -3925,11 +4044,12 @@ static int _nfs4_proc_delegreturn(struct inode *inode, struct rpc_cred *cred, co
         data = kzalloc(sizeof(*data), GFP_NOFS);
         if (data == NULL)
                 return -ENOMEM;
+       nfs41_init_sequence(&data->args.seq_args, &data->res.seq_res, 1);
         data->args.fhandle = &data->fh;
         data->args.stateid = &data->stateid;
         data->args.bitmask = server->attr_bitmask;
         nfs_copy_fh(&data->fh, NFS_FH(inode));
-       memcpy(&data->stateid, stateid, sizeof(data->stateid));
+       nfs4_stateid_copy(&data->stateid, stateid);
         data->res.fattr = &data->fattr;
         data->res.server = server;
         nfs_fattr_init(data->res.fattr);
@@ -4016,7 +4136,7 @@ static int _nfs4_proc_getlk(struct nfs4_state *state, int cmd, struct file_lock
         if (status != 0)
                 goto out;
         lsp = request->fl_u.nfs4_fl.owner;
-       arg.lock_owner.id = lsp->ls_id.id;
+       arg.lock_owner.id = lsp->ls_seqid.owner_id;
         arg.lock_owner.s_dev = server->s_dev;
         status = nfs4_call_sync(server->client, server, &msg, &arg.seq_args, &res.seq_res, 1);
         switch (status) {
@@ -4112,9 +4232,8 @@ static void nfs4_locku_done(struct rpc_task *task, void *data)
                 return;
         switch (task->tk_status) {
                 case 0:
-                       memcpy(calldata->lsp->ls_stateid.data,
-                                       calldata->res.stateid.data,
-                                       sizeof(calldata->lsp->ls_stateid.data));
+                       nfs4_stateid_copy(&calldata->lsp->ls_stateid,
+                                       &calldata->res.stateid);
                         renew_lease(calldata->server, calldata->timestamp);
                         break;
                 case -NFS4ERR_BAD_STATEID:
@@ -4142,7 +4261,7 @@ static void nfs4_locku_prepare(struct rpc_task *task, void *data)
         calldata->timestamp = jiffies;
         if (nfs4_setup_sequence(calldata->server,
                                 &calldata->arg.seq_args,
-                               &calldata->res.seq_res, 1, task))
+                               &calldata->res.seq_res, task))
                 return;
         rpc_call_start(task);
  }
@@ -4182,6 +4301,7 @@ static struct rpc_task *nfs4_do_unlck(struct file_lock *fl,
                 return ERR_PTR(-ENOMEM);
         }
  
+       nfs41_init_sequence(&data->arg.seq_args, &data->res.seq_res, 1);
         msg.rpc_argp = &data->arg;
         msg.rpc_resp = &data->res;
         task_setup_data.callback_data = data;
@@ -4261,7 +4381,7 @@ static struct nfs4_lockdata *nfs4_alloc_lockdata(struct file_lock *fl,
                 goto out_free_seqid;
         p->arg.lock_stateid = &lsp->ls_stateid;
         p->arg.lock_owner.clientid = server->nfs_client->cl_clientid;
-       p->arg.lock_owner.id = lsp->ls_id.id;
+       p->arg.lock_owner.id = lsp->ls_seqid.owner_id;
         p->arg.lock_owner.s_dev = server->s_dev;
         p->res.lock_seqid = p->arg.lock_seqid;
         p->lsp = lsp;
@@ -4297,7 +4417,7 @@ static void nfs4_lock_prepare(struct rpc_task *task, void *calldata)
         data->timestamp = jiffies;
         if (nfs4_setup_sequence(data->server,
                                 &data->arg.seq_args,
-                               &data->res.seq_res, 1, task))
+                               &data->res.seq_res, task))
                 return;
         rpc_call_start(task);
         dprintk("%s: done!, ret = %d\n", __func__, data->rpc_status);
@@ -4326,8 +4446,7 @@ static void nfs4_lock_done(struct rpc_task *task, void *calldata)
                         goto out;
         }
         if (data->rpc_status == 0) {
-               memcpy(data->lsp->ls_stateid.data, data->res.stateid.data,
-                                       sizeof(data->lsp->ls_stateid.data));
+               nfs4_stateid_copy(&data->lsp->ls_stateid, &data->res.stateid);
                 data->lsp->ls_flags |= NFS_LOCK_INITIALIZED;
                 renew_lease(NFS_SERVER(data->ctx->dentry->d_inode), data->timestamp);
         }
@@ -4415,6 +4534,7 @@ static int _nfs4_do_setlk(struct nfs4_state *state, int cmd, struct file_lock *f
                         data->arg.reclaim = NFS_LOCK_RECLAIM;
                 task_setup_data.callback_ops = &nfs4_recover_lock_ops;
         }
+       nfs41_init_sequence(&data->arg.seq_args, &data->res.seq_res, 1);
         msg.rpc_argp = &data->arg;
         msg.rpc_resp = &data->res;
         task_setup_data.callback_data = data;
@@ -4479,15 +4599,34 @@ out:
  }
  
  #if defined(CONFIG_NFS_V4_1)
-static int nfs41_lock_expired(struct nfs4_state *state, struct file_lock *request)
+static int nfs41_check_expired_locks(struct nfs4_state *state)
  {
-       int status;
+       int status, ret = NFS_OK;
+       struct nfs4_lock_state *lsp;
         struct nfs_server *server = NFS_SERVER(state->inode);
  
-       status = nfs41_test_stateid(server, state);
+       list_for_each_entry(lsp, &state->lock_states, ls_locks) {
+               if (lsp->ls_flags & NFS_LOCK_INITIALIZED) {
+                       status = nfs41_test_stateid(server, &lsp->ls_stateid);
+                       if (status != NFS_OK) {
+                               nfs41_free_stateid(server, &lsp->ls_stateid);
+                               lsp->ls_flags &= ~NFS_LOCK_INITIALIZED;
+                               ret = status;
+                       }
+               }
+       };
+
+       return ret;
+}
+
+static int nfs41_lock_expired(struct nfs4_state *state, struct file_lock *request)
+{
+       int status = NFS_OK;
+
+       if (test_bit(LK_STATE_IN_USE, &state->flags))
+               status = nfs41_check_expired_locks(state);
         if (status == NFS_OK)
-               return 0;
-       nfs41_free_stateid(server, state);
+               return status;
         return nfs4_lock_expired(state, request);
  }
  #endif
@@ -4523,7 +4662,8 @@ static int _nfs4_proc_setlk(struct nfs4_state *state, int cmd, struct file_lock
         /* Note: we always want to sleep here! */
         request->fl_flags = fl_flags | FL_SLEEP;
         if (do_vfs_lock(request->fl_file, request) < 0)
-               printk(KERN_WARNING "%s: VFS is out of sync with lock manager!\n", __func__);
+               printk(KERN_WARNING "NFS: %s: VFS is out of sync with lock "
+                       "manager!\n", __func__);
  out_unlock:
         up_read(&nfsi->rwsem);
  out:
@@ -4533,7 +4673,9 @@ out:
  
  static int nfs4_proc_setlk(struct nfs4_state *state, int cmd, struct file_lock *request)
  {
-       struct nfs4_exception exception = { };
+       struct nfs4_exception exception = {
+               .state = state,
+       };
         int err;
  
         do {
@@ -4603,8 +4745,8 @@ int nfs4_lock_delegation_recall(struct nfs4_state *state, struct file_lock *fl)
                 err = _nfs4_do_setlk(state, F_SETLK, fl, NFS_LOCK_NEW);
                 switch (err) {
                         default:
-                               printk(KERN_ERR "%s: unhandled error %d.\n",
-                                               __func__, err);
+                               printk(KERN_ERR "NFS: %s: unhandled error "
+                                       "%d.\n", __func__, err);
                         case 0:
                         case -ESTALE:
                                 goto out;
@@ -4626,6 +4768,7 @@ int nfs4_lock_delegation_recall(struct nfs4_state *state, struct file_lock *fl)
                                  * The show must go on: exit, but mark the
                                  * stateid as needing recovery.
                                  */
+                       case -NFS4ERR_DELEG_REVOKED:
                         case -NFS4ERR_ADMIN_REVOKED:
                         case -NFS4ERR_BAD_STATEID:
                         case -NFS4ERR_OPENMODE:
@@ -4655,33 +4798,44 @@ out:
         return err;
  }
  
+struct nfs_release_lockowner_data {
+       struct nfs4_lock_state *lsp;
+       struct nfs_server *server;
+       struct nfs_release_lockowner_args args;
+};
+
  static void nfs4_release_lockowner_release(void *calldata)
  {
+       struct nfs_release_lockowner_data *data = calldata;
+       nfs4_free_lock_state(data->server, data->lsp);
         kfree(calldata);
  }
  
-const struct rpc_call_ops nfs4_release_lockowner_ops = {
+static const struct rpc_call_ops nfs4_release_lockowner_ops = {
         .rpc_release = nfs4_release_lockowner_release,
  };
  
-void nfs4_release_lockowner(const struct nfs4_lock_state *lsp)
+int nfs4_release_lockowner(struct nfs4_lock_state *lsp)
  {
         struct nfs_server *server = lsp->ls_state->owner->so_server;
-       struct nfs_release_lockowner_args *args;
+       struct nfs_release_lockowner_data *data;
         struct rpc_message msg = {
                 .rpc_proc = &nfs4_procedures[NFSPROC4_CLNT_RELEASE_LOCKOWNER],
         };
  
         if (server->nfs_client->cl_mvops->minor_version != 0)
-               return;
-       args = kmalloc(sizeof(*args), GFP_NOFS);
-       if (!args)
-               return;
-       args->lock_owner.clientid = server->nfs_client->cl_clientid;
-       args->lock_owner.id = lsp->ls_id.id;
-       args->lock_owner.s_dev = server->s_dev;
-       msg.rpc_argp = args;
-       rpc_call_async(server->client, &msg, 0, &nfs4_release_lockowner_ops, args);
+               return -EINVAL;
+       data = kmalloc(sizeof(*data), GFP_NOFS);
+       if (!data)
+               return -ENOMEM;
+       data->lsp = lsp;
+       data->server = server;
+       data->args.lock_owner.clientid = server->nfs_client->cl_clientid;
+       data->args.lock_owner.id = lsp->ls_seqid.owner_id;
+       data->args.lock_owner.s_dev = server->s_dev;
+       msg.rpc_argp = &data->args;
+       rpc_call_async(server->client, &msg, 0, &nfs4_release_lockowner_ops, data);
+       return 0;
  }
  
  #define XATTR_NAME_NFSV4_ACL "system.nfs4_acl"
@@ -4727,11 +4881,11 @@ static void nfs_fixup_referral_attributes(struct nfs_fattr *fattr)
         if (!(((fattr->valid & NFS_ATTR_FATTR_MOUNTED_ON_FILEID) ||
                (fattr->valid & NFS_ATTR_FATTR_FILEID)) &&
               (fattr->valid & NFS_ATTR_FATTR_FSID) &&
-             (fattr->valid & NFS_ATTR_FATTR_V4_REFERRAL)))
+             (fattr->valid & NFS_ATTR_FATTR_V4_LOCATIONS)))
                 return;
  
         fattr->valid |= NFS_ATTR_FATTR_TYPE | NFS_ATTR_FATTR_MODE |
-               NFS_ATTR_FATTR_NLINK;
+               NFS_ATTR_FATTR_NLINK | NFS_ATTR_FATTR_V4_REFERRAL;
         fattr->mode = S_IFDIR | S_IRUGO | S_IXUGO;
         fattr->nlink = 2;
  }
@@ -4798,7 +4952,8 @@ static int _nfs4_proc_secinfo(struct inode *dir, const struct qstr *name, struct
         return status;
  }
  
-int nfs4_proc_secinfo(struct inode *dir, const struct qstr *name, struct nfs4_secinfo_flavors *flavors)
+static int nfs4_proc_secinfo(struct inode *dir, const struct qstr *name,
+               struct nfs4_secinfo_flavors *flavors)
  {
         struct nfs4_exception exception = { };
         int err;
@@ -4852,6 +5007,7 @@ int nfs4_proc_exchange_id(struct nfs_client *clp, struct rpc_cred *cred)
  {
         nfs4_verifier verifier;
         struct nfs41_exchange_id_args args = {
+               .verifier = &verifier,
                 .client = clp,
                 .flags = EXCHGID4_FLAG_SUPP_MOVED_REFER,
         };
@@ -4865,15 +5021,11 @@ int nfs4_proc_exchange_id(struct nfs_client *clp, struct rpc_cred *cred)
                 .rpc_resp = &res,
                 .rpc_cred = cred,
         };
-       __be32 *p;
  
         dprintk("--> %s\n", __func__);
         BUG_ON(clp == NULL);
  
-       p = (u32 *)verifier.data;
-       *p++ = htonl((u32)clp->cl_boot_time.tv_sec);
-       *p = htonl((u32)clp->cl_boot_time.tv_nsec);
-       args.verifier = &verifier;
+       nfs4_construct_boot_verifier(clp, &verifier);
  
         args.id_len = scnprintf(args.id, sizeof(args.id),
                                 "%s/%s.%s/%u",
@@ -4888,10 +5040,23 @@ int nfs4_proc_exchange_id(struct nfs_client *clp, struct rpc_cred *cred)
                 goto out;
         }
  
+       res.impl_id = kzalloc(sizeof(struct nfs41_impl_id), GFP_KERNEL);
+       if (unlikely(!res.impl_id)) {
+               status = -ENOMEM;
+               goto out_server_scope;
+       }
+
         status = rpc_call_sync(clp->cl_rpcclient, &msg, RPC_TASK_TIMEOUT);
         if (!status)
                 status = nfs4_check_cl_exchange_flags(clp->cl_exchange_flags);
  
+       if (!status) {
+               /* use the most recent implementation id */
+               kfree(clp->impl_id);
+               clp->impl_id = res.impl_id;
+       } else
+               kfree(res.impl_id);
+
         if (!status) {
                 if (clp->server_scope &&
                     !nfs41_same_server_scope(clp->server_scope,
@@ -4908,8 +5073,16 @@ int nfs4_proc_exchange_id(struct nfs_client *clp, struct rpc_cred *cred)
                         goto out;
                 }
         }
+
+out_server_scope:
         kfree(res.server_scope);
  out:
+       if (clp->impl_id)
+               dprintk("%s: Server Implementation ID: "
+                       "domain: %s, name: %s, date: %llu,%u\n",
+                       __func__, clp->impl_id->domain, clp->impl_id->name,
+                       clp->impl_id->date.seconds,
+                       clp->impl_id->date.nseconds);
         dprintk("<-- %s status= %d\n", __func__, status);
         return status;
  }
@@ -4933,7 +5106,7 @@ static void nfs4_get_lease_time_prepare(struct rpc_task *task,
            since we're invoked within one */
         ret = nfs41_setup_sequence(data->clp->cl_session,
                                    &data->args->la_seq_args,
-                                  &data->res->lr_seq_res, 0, task);
+                                  &data->res->lr_seq_res, task);
  
         BUG_ON(ret == -EAGAIN);
         rpc_call_start(task);
@@ -4966,7 +5139,7 @@ static void nfs4_get_lease_time_done(struct rpc_task *task, void *calldata)
         dprintk("<-- %s\n", __func__);
  }
  
-struct rpc_call_ops nfs4_get_lease_time_ops = {
+static const struct rpc_call_ops nfs4_get_lease_time_ops = {
         .rpc_call_prepare = nfs4_get_lease_time_prepare,
         .rpc_call_done = nfs4_get_lease_time_done,
  };
@@ -4997,6 +5170,7 @@ int nfs4_proc_get_lease_time(struct nfs_client *clp, struct nfs_fsinfo *fsinfo)
         };
         int status;
  
+       nfs41_init_sequence(&args.la_seq_args, &res.lr_seq_res, 0);
         dprintk("--> %s\n", __func__);
         task = rpc_run_task(&task_setup);
  
@@ -5113,13 +5287,13 @@ struct nfs4_session *nfs4_alloc_session(struct nfs_client *clp)
                 return NULL;
  
         tbl = &session->fc_slot_table;
-       tbl->highest_used_slotid = -1;
+       tbl->highest_used_slotid = NFS4_NO_SLOT;
         spin_lock_init(&tbl->slot_tbl_lock);
         rpc_init_priority_wait_queue(&tbl->slot_tbl_waitq, "ForeChannel Slot table");
         init_completion(&tbl->complete);
  
         tbl = &session->bc_slot_table;
-       tbl->highest_used_slotid = -1;
+       tbl->highest_used_slotid = NFS4_NO_SLOT;
         spin_lock_init(&tbl->slot_tbl_lock);
         rpc_init_wait_queue(&tbl->slot_tbl_waitq, "BackChannel Slot table");
         init_completion(&tbl->complete);
@@ -5132,11 +5306,16 @@ struct nfs4_session *nfs4_alloc_session(struct nfs_client *clp)
  
  void nfs4_destroy_session(struct nfs4_session *session)
  {
+       struct rpc_xprt *xprt;
+
         nfs4_proc_destroy_session(session);
+
+       rcu_read_lock();
+       xprt = rcu_dereference(session->clp->cl_rpcclient->cl_xprt);
+       rcu_read_unlock();
         dprintk("%s Destroy backchannel for xprt %p\n",
-               __func__, session->clp->cl_rpcclient->cl_xprt);
-       xprt_destroy_backchannel(session->clp->cl_rpcclient->cl_xprt,
-                               NFS41_BC_MIN_CALLBACKS);
+               __func__, xprt);
+       xprt_destroy_backchannel(xprt, NFS41_BC_MIN_CALLBACKS);
         nfs4_destroy_slot_tables(session);
         kfree(session);
  }
@@ -5164,7 +5343,7 @@ static void nfs4_init_channel_attrs(struct nfs41_create_session_args *args)
         args->fc_attrs.max_rqst_sz = mxrqst_sz;
         args->fc_attrs.max_resp_sz = mxresp_sz;
         args->fc_attrs.max_ops = NFS4_MAX_OPS;
-       args->fc_attrs.max_reqs = session->clp->cl_rpcclient->cl_xprt->max_reqs;
+       args->fc_attrs.max_reqs = max_session_slots;
  
         dprintk("%s: Fore Channel : max_rqst_sz=%u max_resp_sz=%u "
                 "max_ops=%u max_reqs=%u\n",
@@ -5204,6 +5383,8 @@ static int nfs4_verify_fore_channel_attrs(struct nfs41_create_session_args *args
                 return -EINVAL;
         if (rcvd->max_reqs == 0)
                 return -EINVAL;
+       if (rcvd->max_reqs > NFS4_MAX_SLOT_TABLE)
+               rcvd->max_reqs = NFS4_MAX_SLOT_TABLE;
         return 0;
  }
  
@@ -5219,9 +5400,9 @@ static int nfs4_verify_back_channel_attrs(struct nfs41_create_session_args *args
         if (rcvd->max_resp_sz_cached > sent->max_resp_sz_cached)
                 return -EINVAL;
         /* These would render the backchannel useless: */
-       if (rcvd->max_ops  == 0)
+       if (rcvd->max_ops != sent->max_ops)
                 return -EINVAL;
-       if (rcvd->max_reqs == 0)
+       if (rcvd->max_reqs != sent->max_reqs)
                 return -EINVAL;
         return 0;
  }
@@ -5324,7 +5505,7 @@ int nfs4_proc_destroy_session(struct nfs4_session *session)
  
         if (status)
                 printk(KERN_WARNING
-                       "Got error %d from the server on DESTROY_SESSION. "
+                       "NFS: Got error %d from the server on DESTROY_SESSION. "
                         "Session has been destroyed regardless...\n", status);
  
         dprintk("<-- nfs4_proc_destroy_session\n");
@@ -5447,7 +5628,7 @@ static void nfs41_sequence_prepare(struct rpc_task *task, void *data)
         args = task->tk_msg.rpc_argp;
         res = task->tk_msg.rpc_resp;
  
-       if (nfs41_setup_sequence(clp->cl_session, args, res, 0, task))
+       if (nfs41_setup_sequence(clp->cl_session, args, res, task))
                 return;
         rpc_call_start(task);
  }
@@ -5479,6 +5660,7 @@ static struct rpc_task *_nfs41_proc_sequence(struct nfs_client *clp, struct rpc_
                 nfs_put_client(clp);
                 return ERR_PTR(-ENOMEM);
         }
+       nfs41_init_sequence(&calldata->args, &calldata->res, 0);
         msg.rpc_argp = &calldata->args;
         msg.rpc_resp = &calldata->res;
         calldata->clp = clp;
@@ -5540,7 +5722,7 @@ static void nfs4_reclaim_complete_prepare(struct rpc_task *task, void *data)
         rpc_task_set_priority(task, RPC_PRIORITY_PRIVILEGED);
         if (nfs41_setup_sequence(calldata->clp->cl_session,
                                 &calldata->arg.seq_args,
-                               &calldata->res.seq_res, 0, task))
+                               &calldata->res.seq_res, task))
                 return;
  
         rpc_call_start(task);
@@ -5619,6 +5801,7 @@ static int nfs41_proc_reclaim_complete(struct nfs_client *clp)
         calldata->clp = clp;
         calldata->arg.one_fs = 0;
  
+       nfs41_init_sequence(&calldata->arg.seq_args, &calldata->res.seq_res, 0);
         msg.rpc_argp = &calldata->arg;
         msg.rpc_resp = &calldata->res;
         task_setup_data.callback_data = calldata;
@@ -5650,7 +5833,7 @@ nfs4_layoutget_prepare(struct rpc_task *task, void *calldata)
          * to be no way to prevent it completely.
          */
         if (nfs4_setup_sequence(server, &lgp->args.seq_args,
-                               &lgp->res.seq_res, 0, task))
+                               &lgp->res.seq_res, task))
                 return;
         if (pnfs_choose_layoutget_stateid(&lgp->args.stateid,
                                           NFS_I(lgp->args.inode)->layout,
@@ -5725,6 +5908,7 @@ int nfs4_proc_layoutget(struct nfs4_layoutget *lgp)
  
         lgp->res.layoutp = &lgp->args.layout;
         lgp->res.seq_res.sr_slot = NULL;
+       nfs41_init_sequence(&lgp->args.seq_args, &lgp->res.seq_res, 0);
         task = rpc_run_task(&task_setup_data);
         if (IS_ERR(task))
                 return PTR_ERR(task);
@@ -5745,7 +5929,7 @@ nfs4_layoutreturn_prepare(struct rpc_task *task, void *calldata)
  
         dprintk("--> %s\n", __func__);
         if (nfs41_setup_sequence(lrp->clp->cl_session, &lrp->args.seq_args,
-                               &lrp->res.seq_res, 0, task))
+                               &lrp->res.seq_res, task))
                 return;
         rpc_call_start(task);
  }
@@ -5811,6 +5995,7 @@ int nfs4_proc_layoutreturn(struct nfs4_layoutreturn *lrp)
         int status;
  
         dprintk("--> %s\n", __func__);
+       nfs41_init_sequence(&lrp->args.seq_args, &lrp->res.seq_res, 1);
         task = rpc_run_task(&task_setup_data);
         if (IS_ERR(task))
                 return PTR_ERR(task);
@@ -5911,7 +6096,7 @@ static void nfs4_layoutcommit_prepare(struct rpc_task *task, void *calldata)
         struct nfs_server *server = NFS_SERVER(data->args.inode);
  
         if (nfs4_setup_sequence(server, &data->args.seq_args,
-                               &data->res.seq_res, 1, task))
+                               &data->res.seq_res, task))
                 return;
         rpc_call_start(task);
  }
@@ -5998,6 +6183,7 @@ nfs4_proc_layoutcommit(struct nfs4_layoutcommit_data *data, bool sync)
                 data->args.lastbytewritten,
                 data->args.inode->i_ino);
  
+       nfs41_init_sequence(&data->args.seq_args, &data->res.seq_res, 1);
         task = rpc_run_task(&task_setup_data);
         if (IS_ERR(task))
                 return PTR_ERR(task);
@@ -6091,11 +6277,12 @@ out_freepage:
  out:
         return err;
  }
-static int _nfs41_test_stateid(struct nfs_server *server, struct nfs4_state *state)
+
+static int _nfs41_test_stateid(struct nfs_server *server, nfs4_stateid *stateid)
  {
         int status;
         struct nfs41_test_stateid_args args = {
-               .stateid = &state->stateid,
+               .stateid = stateid,
         };
         struct nfs41_test_stateid_res res;
         struct rpc_message msg = {
@@ -6103,28 +6290,31 @@ static int _nfs41_test_stateid(struct nfs_server *server, struct nfs4_state *sta
                 .rpc_argp = &args,
                 .rpc_resp = &res,
         };
-       args.seq_args.sa_session = res.seq_res.sr_session = NULL;
-       status = nfs4_call_sync_sequence(server->client, server, &msg, &args.seq_args, &res.seq_res, 0, 1);
+
+       nfs41_init_sequence(&args.seq_args, &res.seq_res, 0);
+       status = nfs4_call_sync_sequence(server->client, server, &msg, &args.seq_args, &res.seq_res, 1);
+
+       if (status == NFS_OK)
+               return res.status;
         return status;
  }
  
-static int nfs41_test_stateid(struct nfs_server *server, struct nfs4_state *state)
+static int nfs41_test_stateid(struct nfs_server *server, nfs4_stateid *stateid)
  {
         struct nfs4_exception exception = { };
         int err;
         do {
                 err = nfs4_handle_exception(server,
-                               _nfs41_test_stateid(server, state),
+                               _nfs41_test_stateid(server, stateid),
                                 &exception);
         } while (exception.retry);
         return err;
  }
  
-static int _nfs4_free_stateid(struct nfs_server *server, struct nfs4_state *state)
+static int _nfs4_free_stateid(struct nfs_server *server, nfs4_stateid *stateid)
  {
-       int status;
         struct nfs41_free_stateid_args args = {
-               .stateid = &state->stateid,
+               .stateid = stateid,
         };
         struct nfs41_free_stateid_res res;
         struct rpc_message msg = {
@@ -6133,25 +6323,46 @@ static int _nfs4_free_stateid(struct nfs_server *server, struct nfs4_state *stat
                 .rpc_resp = &res,
         };
  
-       args.seq_args.sa_session = res.seq_res.sr_session = NULL;
-       status = nfs4_call_sync_sequence(server->client, server, &msg, &args.seq_args, &res.seq_res, 0, 1);
-       return status;
+       nfs41_init_sequence(&args.seq_args, &res.seq_res, 0);
+       return nfs4_call_sync_sequence(server->client, server, &msg, &args.seq_args, &res.seq_res, 1);
  }
  
-static int nfs41_free_stateid(struct nfs_server *server, struct nfs4_state *state)
+static int nfs41_free_stateid(struct nfs_server *server, nfs4_stateid *stateid)
  {
         struct nfs4_exception exception = { };
         int err;
         do {
                 err = nfs4_handle_exception(server,
-                               _nfs4_free_stateid(server, state),
+                               _nfs4_free_stateid(server, stateid),
                                 &exception);
         } while (exception.retry);
         return err;
  }
+
+static bool nfs41_match_stateid(const nfs4_stateid *s1,
+               const nfs4_stateid *s2)
+{
+       if (memcmp(s1->other, s2->other, sizeof(s1->other)) != 0)
+               return false;
+
+       if (s1->seqid == s2->seqid)
+               return true;
+       if (s1->seqid == 0 || s2->seqid == 0)
+               return true;
+
+       return false;
+}
+
  #endif /* CONFIG_NFS_V4_1 */
  
-struct nfs4_state_recovery_ops nfs40_reboot_recovery_ops = {
+static bool nfs4_match_stateid(const nfs4_stateid *s1,
+               const nfs4_stateid *s2)
+{
+       return nfs4_stateid_match(s1, s2);
+}
+
+
+static const struct nfs4_state_recovery_ops nfs40_reboot_recovery_ops = {
         .owner_flag_bit = NFS_OWNER_RECLAIM_REBOOT,
         .state_flag_bit = NFS_STATE_RECLAIM_REBOOT,
         .recover_open   = nfs4_open_reclaim,
@@ -6161,7 +6372,7 @@ struct nfs4_state_recovery_ops nfs40_reboot_recovery_ops = {
  };
  
  #if defined(CONFIG_NFS_V4_1)
-struct nfs4_state_recovery_ops nfs41_reboot_recovery_ops = {
+static const struct nfs4_state_recovery_ops nfs41_reboot_recovery_ops = {
         .owner_flag_bit = NFS_OWNER_RECLAIM_REBOOT,
         .state_flag_bit = NFS_STATE_RECLAIM_REBOOT,
         .recover_open   = nfs4_open_reclaim,
@@ -6172,7 +6383,7 @@ struct nfs4_state_recovery_ops nfs41_reboot_recovery_ops = {
  };
  #endif /* CONFIG_NFS_V4_1 */
  
-struct nfs4_state_recovery_ops nfs40_nograce_recovery_ops = {
+static const struct nfs4_state_recovery_ops nfs40_nograce_recovery_ops = {
         .owner_flag_bit = NFS_OWNER_RECLAIM_NOGRACE,
         .state_flag_bit = NFS_STATE_RECLAIM_NOGRACE,
         .recover_open   = nfs4_open_expired,
@@ -6182,7 +6393,7 @@ struct nfs4_state_recovery_ops nfs40_nograce_recovery_ops = {
  };
  
  #if defined(CONFIG_NFS_V4_1)
-struct nfs4_state_recovery_ops nfs41_nograce_recovery_ops = {
+static const struct nfs4_state_recovery_ops nfs41_nograce_recovery_ops = {
         .owner_flag_bit = NFS_OWNER_RECLAIM_NOGRACE,
         .state_flag_bit = NFS_STATE_RECLAIM_NOGRACE,
         .recover_open   = nfs41_open_expired,
@@ -6192,14 +6403,14 @@ struct nfs4_state_recovery_ops nfs41_nograce_recovery_ops = {
  };
  #endif /* CONFIG_NFS_V4_1 */
  
-struct nfs4_state_maintenance_ops nfs40_state_renewal_ops = {
+static const struct nfs4_state_maintenance_ops nfs40_state_renewal_ops = {
         .sched_state_renewal = nfs4_proc_async_renew,
         .get_state_renewal_cred_locked = nfs4_get_renew_cred_locked,
         .renew_lease = nfs4_proc_renew,
  };
  
  #if defined(CONFIG_NFS_V4_1)
-struct nfs4_state_maintenance_ops nfs41_state_renewal_ops = {
+static const struct nfs4_state_maintenance_ops nfs41_state_renewal_ops = {
         .sched_state_renewal = nfs41_proc_async_sequence,
         .get_state_renewal_cred_locked = nfs4_get_machine_cred_locked,
         .renew_lease = nfs4_proc_sequence,
@@ -6209,7 +6420,7 @@ struct nfs4_state_maintenance_ops nfs41_state_renewal_ops = {
  static const struct nfs4_minor_version_ops nfs_v4_0_minor_ops = {
         .minor_version = 0,
         .call_sync = _nfs4_call_sync,
-       .validate_stateid = nfs4_validate_delegation_stateid,
+       .match_stateid = nfs4_match_stateid,
         .find_root_sec = nfs4_find_root_sec,
         .reboot_recovery_ops = &nfs40_reboot_recovery_ops,
         .nograce_recovery_ops = &nfs40_nograce_recovery_ops,
@@ -6220,7 +6431,7 @@ static const struct nfs4_minor_version_ops nfs_v4_0_minor_ops = {
  static const struct nfs4_minor_version_ops nfs_v4_1_minor_ops = {
         .minor_version = 1,
         .call_sync = _nfs4_call_sync_session,
-       .validate_stateid = nfs41_validate_delegation_stateid,
+       .match_stateid = nfs41_match_stateid,
         .find_root_sec = nfs41_find_root_sec,
         .reboot_recovery_ops = &nfs41_reboot_recovery_ops,
         .nograce_recovery_ops = &nfs41_nograce_recovery_ops,
@@ -6260,9 +6471,11 @@ const struct nfs_rpc_ops nfs_v4_clientops = {
         .create         = nfs4_proc_create,
         .remove         = nfs4_proc_remove,
         .unlink_setup   = nfs4_proc_unlink_setup,
+       .unlink_rpc_prepare = nfs4_proc_unlink_rpc_prepare,
         .unlink_done    = nfs4_proc_unlink_done,
         .rename         = nfs4_proc_rename,
         .rename_setup   = nfs4_proc_rename_setup,
+       .rename_rpc_prepare = nfs4_proc_rename_rpc_prepare,
         .rename_done    = nfs4_proc_rename_done,
         .link           = nfs4_proc_link,
         .symlink        = nfs4_proc_symlink,
@@ -6276,8 +6489,10 @@ const struct nfs_rpc_ops nfs_v4_clientops = {
         .set_capabilities = nfs4_server_capabilities,
         .decode_dirent  = nfs4_decode_dirent,
         .read_setup     = nfs4_proc_read_setup,
+       .read_rpc_prepare = nfs4_proc_read_rpc_prepare,
         .read_done      = nfs4_read_done,
         .write_setup    = nfs4_proc_write_setup,
+       .write_rpc_prepare = nfs4_proc_write_rpc_prepare,
         .write_done     = nfs4_write_done,
         .commit_setup   = nfs4_proc_commit_setup,
         .commit_done    = nfs4_commit_done,
@@ -6301,6 +6516,10 @@ const struct xattr_handler *nfs4_xattr_handlers[] = {
         NULL
  };
  
+module_param(max_session_slots, ushort, 0644);
+MODULE_PARM_DESC(max_session_slots, "Maximum number of outstanding NFSv4.1 "
+               "requests the client will negotiate");
+
  /*
   * Local variables:
   *  c-basic-offset: 8
diff --git a/fs/nfs/nfs4state.c b/fs/nfs/nfs4state.c
index 45392032e7bd60b85b00fb74f86ca99a603e31d4..0f43414eb25a141be336c34bef78cc126cd9039f 100644
--- a/fs/nfs/nfs4state.c
+++ b/fs/nfs/nfs4state.c
@@ -146,6 +146,11 @@ struct rpc_cred *nfs4_get_renew_cred_locked(struct nfs_client *clp)
         struct rpc_cred *cred = NULL;
         struct nfs_server *server;
  
+       /* Use machine credentials if available */
+       cred = nfs4_get_machine_cred_locked(clp);
+       if (cred != NULL)
+               goto out;
+
         rcu_read_lock();
         list_for_each_entry_rcu(server, &clp->cl_superblocks, client_link) {
                 cred = nfs4_get_renew_cred_server_locked(server);
@@ -153,6 +158,8 @@ struct rpc_cred *nfs4_get_renew_cred_locked(struct nfs_client *clp)
                         break;
         }
         rcu_read_unlock();
+
+out:
         return cred;
  }
  
@@ -190,30 +197,29 @@ static int nfs41_setup_state_renewal(struct nfs_client *clp)
  static void nfs4_end_drain_session(struct nfs_client *clp)
  {
         struct nfs4_session *ses = clp->cl_session;
+       struct nfs4_slot_table *tbl;
         int max_slots;
  
         if (ses == NULL)
                 return;
+       tbl = &ses->fc_slot_table;
         if (test_and_clear_bit(NFS4_SESSION_DRAINING, &ses->session_state)) {
-               spin_lock(&ses->fc_slot_table.slot_tbl_lock);
-               max_slots = ses->fc_slot_table.max_slots;
+               spin_lock(&tbl->slot_tbl_lock);
+               max_slots = tbl->max_slots;
                 while (max_slots--) {
-                       struct rpc_task *task;
-
-                       task = rpc_wake_up_next(&ses->fc_slot_table.
-                                               slot_tbl_waitq);
-                       if (!task)
+                       if (rpc_wake_up_first(&tbl->slot_tbl_waitq,
+                                               nfs4_set_task_privileged,
+                                               NULL) == NULL)
                                 break;
-                       rpc_task_set_priority(task, RPC_PRIORITY_PRIVILEGED);
                 }
-               spin_unlock(&ses->fc_slot_table.slot_tbl_lock);
+               spin_unlock(&tbl->slot_tbl_lock);
         }
  }
  
  static int nfs4_wait_on_slot_tbl(struct nfs4_slot_table *tbl)
  {
         spin_lock(&tbl->slot_tbl_lock);
-       if (tbl->highest_used_slotid != -1) {
+       if (tbl->highest_used_slotid != NFS4_NO_SLOT) {
                 INIT_COMPLETION(tbl->complete);
                 spin_unlock(&tbl->slot_tbl_lock);
                 return wait_for_completion_interruptible(&tbl->complete);
@@ -317,62 +323,6 @@ out:
         return cred;
  }
  
-static void nfs_alloc_unique_id_locked(struct rb_root *root,
-                                      struct nfs_unique_id *new,
-                                      __u64 minval, int maxbits)
-{
-       struct rb_node **p, *parent;
-       struct nfs_unique_id *pos;
-       __u64 mask = ~0ULL;
-
-       if (maxbits < 64)
-               mask = (1ULL << maxbits) - 1ULL;
-
-       /* Ensure distribution is more or less flat */
-       get_random_bytes(&new->id, sizeof(new->id));
-       new->id &= mask;
-       if (new->id < minval)
-               new->id += minval;
-retry:
-       p = &root->rb_node;
-       parent = NULL;
-
-       while (*p != NULL) {
-               parent = *p;
-               pos = rb_entry(parent, struct nfs_unique_id, rb_node);
-
-               if (new->id < pos->id)
-                       p = &(*p)->rb_left;
-               else if (new->id > pos->id)
-                       p = &(*p)->rb_right;
-               else
-                       goto id_exists;
-       }
-       rb_link_node(&new->rb_node, parent, p);
-       rb_insert_color(&new->rb_node, root);
-       return;
-id_exists:
-       for (;;) {
-               new->id++;
-               if (new->id < minval || (new->id & mask) != new->id) {
-                       new->id = minval;
-                       break;
-               }
-               parent = rb_next(parent);
-               if (parent == NULL)
-                       break;
-               pos = rb_entry(parent, struct nfs_unique_id, rb_node);
-               if (new->id < pos->id)
-                       break;
-       }
-       goto retry;
-}
-
-static void nfs_free_unique_id(struct rb_root *root, struct nfs_unique_id *id)
-{
-       rb_erase(&id->rb_node, root);
-}
-
  static struct nfs4_state_owner *
  nfs4_find_state_owner_locked(struct nfs_server *server, struct rpc_cred *cred)
  {
@@ -405,6 +355,7 @@ nfs4_insert_state_owner_locked(struct nfs4_state_owner *new)
         struct rb_node **p = &server->state_owners.rb_node,
                        *parent = NULL;
         struct nfs4_state_owner *sp;
+       int err;
  
         while (*p != NULL) {
                 parent = *p;
@@ -421,8 +372,9 @@ nfs4_insert_state_owner_locked(struct nfs4_state_owner *new)
                         return sp;
                 }
         }
-       nfs_alloc_unique_id_locked(&server->openowner_id,
-                                       &new->so_owner_id, 1, 64);
+       err = ida_get_new(&server->openowner_id, &new->so_seqid.owner_id);
+       if (err)
+               return ERR_PTR(err);
         rb_link_node(&new->so_server_node, parent, p);
         rb_insert_color(&new->so_server_node, &server->state_owners);
         return new;
@@ -435,7 +387,23 @@ nfs4_remove_state_owner_locked(struct nfs4_state_owner *sp)
  
         if (!RB_EMPTY_NODE(&sp->so_server_node))
                 rb_erase(&sp->so_server_node, &server->state_owners);
-       nfs_free_unique_id(&server->openowner_id, &sp->so_owner_id);
+       ida_remove(&server->openowner_id, sp->so_seqid.owner_id);
+}
+
+static void
+nfs4_init_seqid_counter(struct nfs_seqid_counter *sc)
+{
+       sc->flags = 0;
+       sc->counter = 0;
+       spin_lock_init(&sc->lock);
+       INIT_LIST_HEAD(&sc->list);
+       rpc_init_wait_queue(&sc->wait, "Seqid_waitqueue");
+}
+
+static void
+nfs4_destroy_seqid_counter(struct nfs_seqid_counter *sc)
+{
+       rpc_destroy_wait_queue(&sc->wait);
  }
  
  /*
@@ -444,19 +412,20 @@ nfs4_remove_state_owner_locked(struct nfs4_state_owner *sp)
   *
   */
  static struct nfs4_state_owner *
-nfs4_alloc_state_owner(void)
+nfs4_alloc_state_owner(struct nfs_server *server,
+               struct rpc_cred *cred,
+               gfp_t gfp_flags)
  {
         struct nfs4_state_owner *sp;
  
-       sp = kzalloc(sizeof(*sp),GFP_NOFS);
+       sp = kzalloc(sizeof(*sp), gfp_flags);
         if (!sp)
                 return NULL;
+       sp->so_server = server;
+       sp->so_cred = get_rpccred(cred);
         spin_lock_init(&sp->so_lock);
         INIT_LIST_HEAD(&sp->so_states);
-       rpc_init_wait_queue(&sp->so_sequence.wait, "Seqid_waitqueue");
-       sp->so_seqid.sequence = &sp->so_sequence;
-       spin_lock_init(&sp->so_sequence.lock);
-       INIT_LIST_HEAD(&sp->so_sequence.list);
+       nfs4_init_seqid_counter(&sp->so_seqid);
         atomic_set(&sp->so_count, 1);
         INIT_LIST_HEAD(&sp->so_lru);
         return sp;
@@ -478,7 +447,7 @@ nfs4_drop_state_owner(struct nfs4_state_owner *sp)
  
  static void nfs4_free_state_owner(struct nfs4_state_owner *sp)
  {
-       rpc_destroy_wait_queue(&sp->so_sequence.wait);
+       nfs4_destroy_seqid_counter(&sp->so_seqid);
         put_rpccred(sp->so_cred);
         kfree(sp);
  }
@@ -516,7 +485,8 @@ static void nfs4_gc_state_owners(struct nfs_server *server)
   * Returns a pointer to an instantiated nfs4_state_owner struct, or NULL.
   */
  struct nfs4_state_owner *nfs4_get_state_owner(struct nfs_server *server,
-                                             struct rpc_cred *cred)
+                                             struct rpc_cred *cred,
+                                             gfp_t gfp_flags)
  {
         struct nfs_client *clp = server->nfs_client;
         struct nfs4_state_owner *sp, *new;
@@ -526,20 +496,18 @@ struct nfs4_state_owner *nfs4_get_state_owner(struct nfs_server *server,
         spin_unlock(&clp->cl_lock);
         if (sp != NULL)
                 goto out;
-       new = nfs4_alloc_state_owner();
+       new = nfs4_alloc_state_owner(server, cred, gfp_flags);
         if (new == NULL)
                 goto out;
-       new->so_server = server;
-       new->so_cred = cred;
-       spin_lock(&clp->cl_lock);
-       sp = nfs4_insert_state_owner_locked(new);
-       spin_unlock(&clp->cl_lock);
-       if (sp == new)
-               get_rpccred(cred);
-       else {
-               rpc_destroy_wait_queue(&new->so_sequence.wait);
-               kfree(new);
-       }
+       do {
+               if (ida_pre_get(&server->openowner_id, gfp_flags) == 0)
+                       break;
+               spin_lock(&clp->cl_lock);
+               sp = nfs4_insert_state_owner_locked(new);
+               spin_unlock(&clp->cl_lock);
+       } while (sp == ERR_PTR(-EAGAIN));
+       if (sp != new)
+               nfs4_free_state_owner(new);
  out:
         nfs4_gc_state_owners(server);
         return sp;
@@ -795,15 +763,11 @@ static struct nfs4_lock_state *nfs4_alloc_lock_state(struct nfs4_state *state, f
  {
         struct nfs4_lock_state *lsp;
         struct nfs_server *server = state->owner->so_server;
-       struct nfs_client *clp = server->nfs_client;
  
         lsp = kzalloc(sizeof(*lsp), GFP_NOFS);
         if (lsp == NULL)
                 return NULL;
-       rpc_init_wait_queue(&lsp->ls_sequence.wait, "lock_seqid_waitqueue");
-       spin_lock_init(&lsp->ls_sequence.lock);
-       INIT_LIST_HEAD(&lsp->ls_sequence.list);
-       lsp->ls_seqid.sequence = &lsp->ls_sequence;
+       nfs4_init_seqid_counter(&lsp->ls_seqid);
         atomic_set(&lsp->ls_count, 1);
         lsp->ls_state = state;
         lsp->ls_owner.lo_type = type;
@@ -815,25 +779,22 @@ static struct nfs4_lock_state *nfs4_alloc_lock_state(struct nfs4_state *state, f
                 lsp->ls_owner.lo_u.posix_owner = fl_owner;
                 break;
         default:
-               kfree(lsp);
-               return NULL;
+               goto out_free;
         }
-       spin_lock(&clp->cl_lock);
-       nfs_alloc_unique_id_locked(&server->lockowner_id, &lsp->ls_id, 1, 64);
-       spin_unlock(&clp->cl_lock);
+       lsp->ls_seqid.owner_id = ida_simple_get(&server->lockowner_id, 0, 0, GFP_NOFS);
+       if (lsp->ls_seqid.owner_id < 0)
+               goto out_free;
         INIT_LIST_HEAD(&lsp->ls_locks);
         return lsp;
+out_free:
+       kfree(lsp);
+       return NULL;
  }
  
-static void nfs4_free_lock_state(struct nfs4_lock_state *lsp)
+void nfs4_free_lock_state(struct nfs_server *server, struct nfs4_lock_state *lsp)
  {
-       struct nfs_server *server = lsp->ls_state->owner->so_server;
-       struct nfs_client *clp = server->nfs_client;
-
-       spin_lock(&clp->cl_lock);
-       nfs_free_unique_id(&server->lockowner_id, &lsp->ls_id);
-       spin_unlock(&clp->cl_lock);
-       rpc_destroy_wait_queue(&lsp->ls_sequence.wait);
+       ida_simple_remove(&server->lockowner_id, lsp->ls_seqid.owner_id);
+       nfs4_destroy_seqid_counter(&lsp->ls_seqid);
         kfree(lsp);
  }
  
@@ -865,7 +826,7 @@ static struct nfs4_lock_state *nfs4_get_lock_state(struct nfs4_state *state, fl_
         }
         spin_unlock(&state->state_lock);
         if (new != NULL)
-               nfs4_free_lock_state(new);
+               nfs4_free_lock_state(state->owner->so_server, new);
         return lsp;
  }
  
@@ -886,9 +847,11 @@ void nfs4_put_lock_state(struct nfs4_lock_state *lsp)
         if (list_empty(&state->lock_states))
                 clear_bit(LK_STATE_IN_USE, &state->flags);
         spin_unlock(&state->state_lock);
-       if (lsp->ls_flags & NFS_LOCK_INITIALIZED)
-               nfs4_release_lockowner(lsp);
-       nfs4_free_lock_state(lsp);
+       if (lsp->ls_flags & NFS_LOCK_INITIALIZED) {
+               if (nfs4_release_lockowner(lsp) == 0)
+                       return;
+       }
+       nfs4_free_lock_state(lsp->ls_state->owner->so_server, lsp);
  }
  
  static void nfs4_fl_copy_lock(struct file_lock *dst, struct file_lock *src)
@@ -918,7 +881,8 @@ int nfs4_set_lock_state(struct nfs4_state *state, struct file_lock *fl)
         if (fl->fl_flags & FL_POSIX)
                 lsp = nfs4_get_lock_state(state, fl->fl_owner, 0, NFS4_POSIX_LOCK_TYPE);
         else if (fl->fl_flags & FL_FLOCK)
-               lsp = nfs4_get_lock_state(state, 0, fl->fl_pid, NFS4_FLOCK_LOCK_TYPE);
+               lsp = nfs4_get_lock_state(state, NULL, fl->fl_pid,
+                               NFS4_FLOCK_LOCK_TYPE);
         else
                 return -EINVAL;
         if (lsp == NULL)
@@ -928,28 +892,49 @@ int nfs4_set_lock_state(struct nfs4_state *state, struct file_lock *fl)
         return 0;
  }
  
-/*
- * Byte-range lock aware utility to initialize the stateid of read/write
- * requests.
- */
-void nfs4_copy_stateid(nfs4_stateid *dst, struct nfs4_state *state, fl_owner_t fl_owner, pid_t fl_pid)
+static bool nfs4_copy_lock_stateid(nfs4_stateid *dst, struct nfs4_state *state,
+               fl_owner_t fl_owner, pid_t fl_pid)
  {
         struct nfs4_lock_state *lsp;
-       int seq;
+       bool ret = false;
  
-       do {
-               seq = read_seqbegin(&state->seqlock);
-               memcpy(dst, &state->stateid, sizeof(*dst));
-       } while (read_seqretry(&state->seqlock, seq));
         if (test_bit(LK_STATE_IN_USE, &state->flags) == 0)
-               return;
+               goto out;
  
         spin_lock(&state->state_lock);
         lsp = __nfs4_find_lock_state(state, fl_owner, fl_pid, NFS4_ANY_LOCK_TYPE);
-       if (lsp != NULL && (lsp->ls_flags & NFS_LOCK_INITIALIZED) != 0)
-               memcpy(dst, &lsp->ls_stateid, sizeof(*dst));
+       if (lsp != NULL && (lsp->ls_flags & NFS_LOCK_INITIALIZED) != 0) {
+               nfs4_stateid_copy(dst, &lsp->ls_stateid);
+               ret = true;
+       }
         spin_unlock(&state->state_lock);
         nfs4_put_lock_state(lsp);
+out:
+       return ret;
+}
+
+static void nfs4_copy_open_stateid(nfs4_stateid *dst, struct nfs4_state *state)
+{
+       int seq;
+
+       do {
+               seq = read_seqbegin(&state->seqlock);
+               nfs4_stateid_copy(dst, &state->stateid);
+       } while (read_seqretry(&state->seqlock, seq));
+}
+
+/*
+ * Byte-range lock aware utility to initialize the stateid of read/write
+ * requests.
+ */
+void nfs4_select_rw_stateid(nfs4_stateid *dst, struct nfs4_state *state,
+               fmode_t fmode, fl_owner_t fl_owner, pid_t fl_pid)
+{
+       if (nfs4_copy_delegation_stateid(dst, state->inode, fmode))
+               return;
+       if (nfs4_copy_lock_stateid(dst, state, fl_owner, fl_pid))
+               return;
+       nfs4_copy_open_stateid(dst, state);
  }
  
  struct nfs_seqid *nfs_alloc_seqid(struct nfs_seqid_counter *counter, gfp_t gfp_mask)
@@ -960,20 +945,28 @@ struct nfs_seqid *nfs_alloc_seqid(struct nfs_seqid_counter *counter, gfp_t gfp_m
         if (new != NULL) {
                 new->sequence = counter;
                 INIT_LIST_HEAD(&new->list);
+               new->task = NULL;
         }
         return new;
  }
  
  void nfs_release_seqid(struct nfs_seqid *seqid)
  {
-       if (!list_empty(&seqid->list)) {
-               struct rpc_sequence *sequence = seqid->sequence->sequence;
+       struct nfs_seqid_counter *sequence;
  
-               spin_lock(&sequence->lock);
-               list_del_init(&seqid->list);
-               spin_unlock(&sequence->lock);
-               rpc_wake_up(&sequence->wait);
+       if (list_empty(&seqid->list))
+               return;
+       sequence = seqid->sequence;
+       spin_lock(&sequence->lock);
+       list_del_init(&seqid->list);
+       if (!list_empty(&sequence->list)) {
+               struct nfs_seqid *next;
+
+               next = list_first_entry(&sequence->list,
+                               struct nfs_seqid, list);
+               rpc_wake_up_queued_task(&sequence->wait, next->task);
         }
+       spin_unlock(&sequence->lock);
  }
  
  void nfs_free_seqid(struct nfs_seqid *seqid)
@@ -989,14 +982,14 @@ void nfs_free_seqid(struct nfs_seqid *seqid)
   */
  static void nfs_increment_seqid(int status, struct nfs_seqid *seqid)
  {
-       BUG_ON(list_first_entry(&seqid->sequence->sequence->list, struct nfs_seqid, list) != seqid);
+       BUG_ON(list_first_entry(&seqid->sequence->list, struct nfs_seqid, list) != seqid);
         switch (status) {
                 case 0:
                         break;
                 case -NFS4ERR_BAD_SEQID:
                         if (seqid->sequence->flags & NFS_SEQID_CONFIRMED)
                                 return;
-                       printk(KERN_WARNING "NFS: v4 server returned a bad"
+                       pr_warn_ratelimited("NFS: v4 server returned a bad"
                                         " sequence-id error on an"
                                         " unconfirmed sequence %p!\n",
                                         seqid->sequence);
@@ -1040,10 +1033,11 @@ void nfs_increment_lock_seqid(int status, struct nfs_seqid *seqid)
  
  int nfs_wait_on_sequence(struct nfs_seqid *seqid, struct rpc_task *task)
  {
-       struct rpc_sequence *sequence = seqid->sequence->sequence;
+       struct nfs_seqid_counter *sequence = seqid->sequence;
         int status = 0;
  
         spin_lock(&sequence->lock);
+       seqid->task = task;
         if (list_empty(&seqid->list))
                 list_add_tail(&seqid->list, &sequence->list);
         if (list_first_entry(&sequence->list, struct nfs_seqid, list) == seqid)
@@ -1072,19 +1066,28 @@ static void nfs4_clear_state_manager_bit(struct nfs_client *clp)
  void nfs4_schedule_state_manager(struct nfs_client *clp)
  {
         struct task_struct *task;
+       char buf[INET6_ADDRSTRLEN + sizeof("-manager") + 1];
  
         if (test_and_set_bit(NFS4CLNT_MANAGER_RUNNING, &clp->cl_state) != 0)
                 return;
         __module_get(THIS_MODULE);
         atomic_inc(&clp->cl_count);
-       task = kthread_run(nfs4_run_state_manager, clp, "%s-manager",
-                               rpc_peeraddr2str(clp->cl_rpcclient,
-                                                       RPC_DISPLAY_ADDR));
-       if (!IS_ERR(task))
-               return;
-       nfs4_clear_state_manager_bit(clp);
-       nfs_put_client(clp);
-       module_put(THIS_MODULE);
+
+       /* The rcu_read_lock() is not strictly necessary, as the state
+        * manager is the only thread that ever changes the rpc_xprt
+        * after it's initialized.  At this point, we're single threaded. */
+       rcu_read_lock();
+       snprintf(buf, sizeof(buf), "%s-manager",
+                       rpc_peeraddr2str(clp->cl_rpcclient, RPC_DISPLAY_ADDR));
+       rcu_read_unlock();
+       task = kthread_run(nfs4_run_state_manager, clp, buf);
+       if (IS_ERR(task)) {
+               printk(KERN_ERR "%s: kthread_run: %ld\n",
+                       __func__, PTR_ERR(task));
+               nfs4_clear_state_manager_bit(clp);
+               nfs_put_client(clp);
+               module_put(THIS_MODULE);
+       }
  }
  
  /*
@@ -1098,10 +1101,25 @@ void nfs4_schedule_lease_recovery(struct nfs_client *clp)
                 set_bit(NFS4CLNT_CHECK_LEASE, &clp->cl_state);
         nfs4_schedule_state_manager(clp);
  }
+EXPORT_SYMBOL_GPL(nfs4_schedule_lease_recovery);
+
+/*
+ * nfs40_handle_cb_pathdown - return all delegations after NFS4ERR_CB_PATH_DOWN
+ * @clp: client to process
+ *
+ * Set the NFS4CLNT_LEASE_EXPIRED state in order to force a
+ * resend of the SETCLIENTID and hence re-establish the
+ * callback channel. Then return all existing delegations.
+ */
+static void nfs40_handle_cb_pathdown(struct nfs_client *clp)
+{
+       set_bit(NFS4CLNT_LEASE_EXPIRED, &clp->cl_state);
+       nfs_expire_all_delegations(clp);
+}
  
  void nfs4_schedule_path_down_recovery(struct nfs_client *clp)
  {
-       nfs_handle_cb_pathdown(clp);
+       nfs40_handle_cb_pathdown(clp);
         nfs4_schedule_state_manager(clp);
  }
  
@@ -1132,11 +1150,37 @@ void nfs4_schedule_stateid_recovery(const struct nfs_server *server, struct nfs4
  {
         struct nfs_client *clp = server->nfs_client;
  
-       if (test_and_clear_bit(NFS_DELEGATED_STATE, &state->flags))
-               nfs_async_inode_return_delegation(state->inode, &state->stateid);
         nfs4_state_mark_reclaim_nograce(clp, state);
         nfs4_schedule_state_manager(clp);
  }
+EXPORT_SYMBOL_GPL(nfs4_schedule_stateid_recovery);
+
+void nfs_inode_find_state_and_recover(struct inode *inode,
+               const nfs4_stateid *stateid)
+{
+       struct nfs_client *clp = NFS_SERVER(inode)->nfs_client;
+       struct nfs_inode *nfsi = NFS_I(inode);
+       struct nfs_open_context *ctx;
+       struct nfs4_state *state;
+       bool found = false;
+
+       spin_lock(&inode->i_lock);
+       list_for_each_entry(ctx, &nfsi->open_files, list) {
+               state = ctx->state;
+               if (state == NULL)
+                       continue;
+               if (!test_bit(NFS_DELEGATED_STATE, &state->flags))
+                       continue;
+               if (!nfs4_stateid_match(&state->stateid, stateid))
+                       continue;
+               nfs4_state_mark_reclaim_nograce(clp, state);
+               found = true;
+       }
+       spin_unlock(&inode->i_lock);
+       if (found)
+               nfs4_schedule_state_manager(clp);
+}
+
  
  static int nfs4_reclaim_locks(struct nfs4_state *state, const struct nfs4_state_recovery_ops *ops)
  {
@@ -1175,8 +1219,8 @@ static int nfs4_reclaim_locks(struct nfs4_state *state, const struct nfs4_state_
                         case -NFS4ERR_CONN_NOT_BOUND_TO_SESSION:
                                 goto out;
                         default:
-                               printk(KERN_ERR "%s: unhandled error %d. Zeroing state\n",
-                                               __func__, status);
+                               printk(KERN_ERR "NFS: %s: unhandled error %d. "
+                                       "Zeroing state\n", __func__, status);
                         case -ENOMEM:
                         case -NFS4ERR_DENIED:
                         case -NFS4ERR_RECLAIM_BAD:
@@ -1222,8 +1266,9 @@ restart:
                                 spin_lock(&state->state_lock);
                                 list_for_each_entry(lock, &state->lock_states, ls_locks) {
                                         if (!(lock->ls_flags & NFS_LOCK_INITIALIZED))
-                                               printk("%s: Lock reclaim failed!\n",
-                                                       __func__);
+                                               pr_warn_ratelimited("NFS: "
+                                                       "%s: Lock reclaim "
+                                                       "failed!\n", __func__);
                                 }
                                 spin_unlock(&state->state_lock);
                                 nfs4_put_open_state(state);
@@ -1232,8 +1277,8 @@ restart:
                 }
                 switch (status) {
                         default:
-                               printk(KERN_ERR "%s: unhandled error %d. Zeroing state\n",
-                                               __func__, status);
+                               printk(KERN_ERR "NFS: %s: unhandled error %d. "
+                                       "Zeroing state\n", __func__, status);
                         case -ENOENT:
                         case -ENOMEM:
                         case -ESTALE:
@@ -1241,8 +1286,8 @@ restart:
                                  * Open state on this file cannot be recovered
                                  * All we can do is revert to using the zero stateid.
                                  */
-                               memset(state->stateid.data, 0,
-                                       sizeof(state->stateid.data));
+                               memset(&state->stateid, 0,
+                                       sizeof(state->stateid));
                                 /* Mark the file as being 'closed' */
                                 state->state = 0;
                                 break;
@@ -1420,7 +1465,7 @@ static int nfs4_recovery_handle_error(struct nfs_client *clp, int error)
                 case 0:
                         break;
                 case -NFS4ERR_CB_PATH_DOWN:
-                       nfs_handle_cb_pathdown(clp);
+                       nfs40_handle_cb_pathdown(clp);
                         break;
                 case -NFS4ERR_NO_GRACE:
                         nfs4_state_end_reclaim_reboot(clp);
@@ -1801,7 +1846,7 @@ static void nfs4_state_manager(struct nfs_client *clp)
         } while (atomic_read(&clp->cl_count) > 1);
         return;
  out_error:
-       printk(KERN_WARNING "Error: state manager failed on NFSv4 server %s"
+       pr_warn_ratelimited("NFS: state manager failed on NFSv4 server %s"
                         " with error %d\n", clp->cl_hostname, -status);
         nfs4_end_drain_session(clp);
         nfs4_clear_state_manager_bit(clp);
diff --git a/fs/nfs/nfs4xdr.c b/fs/nfs/nfs4xdr.c
index 33bd8d0f745d8baaa11b41fc3fcffde52ee3f02a..c74fdb114b48af141a719d1facd11ed249c5f5d1 100644
--- a/fs/nfs/nfs4xdr.c
+++ b/fs/nfs/nfs4xdr.c
@@ -44,6 +44,8 @@
  #include <linux/pagemap.h>
  #include <linux/proc_fs.h>
  #include <linux/kdev_t.h>
+#include <linux/module.h>
+#include <linux/utsname.h>
  #include <linux/sunrpc/clnt.h>
  #include <linux/sunrpc/msg_prot.h>
  #include <linux/sunrpc/gss_api.h>
@@ -271,7 +273,12 @@ static int nfs4_stat_to_errno(int);
                                 1 /* flags */ + \
                                 1 /* spa_how */ + \
                                 0 /* SP4_NONE (for now) */ + \
-                               1 /* zero implemetation id array */)
+                               1 /* implementation id array of size 1 */ + \
+                               1 /* nii_domain */ + \
+                               XDR_QUADLEN(NFS4_OPAQUE_LIMIT) + \
+                               1 /* nii_name */ + \
+                               XDR_QUADLEN(NFS4_OPAQUE_LIMIT) + \
+                               3 /* nii_date */)
  #define decode_exchange_id_maxsz (op_decode_hdr_maxsz + \
                                 2 /* eir_clientid */ + \
                                 1 /* eir_sequenceid */ + \
@@ -284,7 +291,11 @@ static int nfs4_stat_to_errno(int);
                                 /* eir_server_scope<> */ \
                                 XDR_QUADLEN(NFS4_OPAQUE_LIMIT) + 1 + \
                                 1 /* eir_server_impl_id array length */ + \
-                               0 /* ignored eir_server_impl_id contents */)
+                               1 /* nii_domain */ + \
+                               XDR_QUADLEN(NFS4_OPAQUE_LIMIT) + \
+                               1 /* nii_name */ + \
+                               XDR_QUADLEN(NFS4_OPAQUE_LIMIT) + \
+                               3 /* nii_date */)
  #define encode_channel_attrs_maxsz  (6 + 1 /* ca_rdma_ird.len (0) */)
  #define decode_channel_attrs_maxsz  (6 + \
                                      1 /* ca_rdma_ird.len */ + \
@@ -838,6 +849,12 @@ const u32 nfs41_maxread_overhead = ((RPC_MAX_HEADER_WITH_AUTH +
                                     XDR_UNIT);
  #endif /* CONFIG_NFS_V4_1 */
  
+static unsigned short send_implementation_id = 1;
+
+module_param(send_implementation_id, ushort, 0644);
+MODULE_PARM_DESC(send_implementation_id,
+               "Send implementation ID with NFSv4.1 exchange_id");
+
  static const umode_t nfs_type2fmt[] = {
         [NF4BAD] = 0,
         [NF4REG] = S_IFREG,
@@ -868,15 +885,44 @@ static __be32 *reserve_space(struct xdr_stream *xdr, size_t nbytes)
         return p;
  }
  
+static void encode_opaque_fixed(struct xdr_stream *xdr, const void *buf, size_t len)
+{
+       __be32 *p;
+
+       p = xdr_reserve_space(xdr, len);
+       xdr_encode_opaque_fixed(p, buf, len);
+}
+
  static void encode_string(struct xdr_stream *xdr, unsigned int len, const char *str)
  {
         __be32 *p;
  
-       p = xdr_reserve_space(xdr, 4 + len);
-       BUG_ON(p == NULL);
+       p = reserve_space(xdr, 4 + len);
         xdr_encode_opaque(p, str, len);
  }
  
+static void encode_uint32(struct xdr_stream *xdr, u32 n)
+{
+       __be32 *p;
+
+       p = reserve_space(xdr, 4);
+       *p = cpu_to_be32(n);
+}
+
+static void encode_uint64(struct xdr_stream *xdr, u64 n)
+{
+       __be32 *p;
+
+       p = reserve_space(xdr, 8);
+       xdr_encode_hyper(p, n);
+}
+
+static void encode_nfs4_seqid(struct xdr_stream *xdr,
+               const struct nfs_seqid *seqid)
+{
+       encode_uint32(xdr, seqid->sequence->counter);
+}
+
  static void encode_compound_hdr(struct xdr_stream *xdr,
                                 struct rpc_rqst *req,
                                 struct compound_hdr *hdr)
@@ -889,28 +935,37 @@ static void encode_compound_hdr(struct xdr_stream *xdr,
          * but this is not required as a MUST for the server to do so. */
         hdr->replen = RPC_REPHDRSIZE + auth->au_rslack + 3 + hdr->taglen;
  
-       dprintk("encode_compound: tag=%.*s\n", (int)hdr->taglen, hdr->tag);
         BUG_ON(hdr->taglen > NFS4_MAXTAGLEN);
-       p = reserve_space(xdr, 4 + hdr->taglen + 8);
-       p = xdr_encode_opaque(p, hdr->tag, hdr->taglen);
+       encode_string(xdr, hdr->taglen, hdr->tag);
+       p = reserve_space(xdr, 8);
         *p++ = cpu_to_be32(hdr->minorversion);
         hdr->nops_p = p;
         *p = cpu_to_be32(hdr->nops);
  }
  
+static void encode_op_hdr(struct xdr_stream *xdr, enum nfs_opnum4 op,
+               uint32_t replen,
+               struct compound_hdr *hdr)
+{
+       encode_uint32(xdr, op);
+       hdr->nops++;
+       hdr->replen += replen;
+}
+
  static void encode_nops(struct compound_hdr *hdr)
  {
         BUG_ON(hdr->nops > NFS4_MAX_OPS);
         *hdr->nops_p = htonl(hdr->nops);
  }
  
-static void encode_nfs4_verifier(struct xdr_stream *xdr, const nfs4_verifier *verf)
+static void encode_nfs4_stateid(struct xdr_stream *xdr, const nfs4_stateid *stateid)
  {
-       __be32 *p;
+       encode_opaque_fixed(xdr, stateid, NFS4_STATEID_SIZE);
+}
  
-       p = xdr_reserve_space(xdr, NFS4_VERIFIER_SIZE);
-       BUG_ON(p == NULL);
-       xdr_encode_opaque_fixed(p, verf->data, NFS4_VERIFIER_SIZE);
+static void encode_nfs4_verifier(struct xdr_stream *xdr, const nfs4_verifier *verf)
+{
+       encode_opaque_fixed(xdr, verf->data, NFS4_VERIFIER_SIZE);
  }
  
  static void encode_attrs(struct xdr_stream *xdr, const struct iattr *iap, const struct nfs_server *server)
@@ -1023,7 +1078,7 @@ static void encode_attrs(struct xdr_stream *xdr, const struct iattr *iap, const
          * Now we backfill the bitmap and the attribute buffer length.
          */
         if (len != ((char *)p - (char *)q) + 4) {
-               printk(KERN_ERR "nfs: Attr length error, %u != %Zu\n",
+               printk(KERN_ERR "NFS: Attr length error, %u != %Zu\n",
                                 len, ((char *)p - (char *)q) + 4);
                 BUG();
         }
@@ -1037,46 +1092,33 @@ static void encode_attrs(struct xdr_stream *xdr, const struct iattr *iap, const
  
  static void encode_access(struct xdr_stream *xdr, u32 access, struct compound_hdr *hdr)
  {
-       __be32 *p;
-
-       p = reserve_space(xdr, 8);
-       *p++ = cpu_to_be32(OP_ACCESS);
-       *p = cpu_to_be32(access);
-       hdr->nops++;
-       hdr->replen += decode_access_maxsz;
+       encode_op_hdr(xdr, OP_ACCESS, decode_access_maxsz, hdr);
+       encode_uint32(xdr, access);
  }
  
  static void encode_close(struct xdr_stream *xdr, const struct nfs_closeargs *arg, struct compound_hdr *hdr)
  {
-       __be32 *p;
-
-       p = reserve_space(xdr, 8+NFS4_STATEID_SIZE);
-       *p++ = cpu_to_be32(OP_CLOSE);
-       *p++ = cpu_to_be32(arg->seqid->sequence->counter);
-       xdr_encode_opaque_fixed(p, arg->stateid->data, NFS4_STATEID_SIZE);
-       hdr->nops++;
-       hdr->replen += decode_close_maxsz;
+       encode_op_hdr(xdr, OP_CLOSE, decode_close_maxsz, hdr);
+       encode_nfs4_seqid(xdr, arg->seqid);
+       encode_nfs4_stateid(xdr, arg->stateid);
  }
  
  static void encode_commit(struct xdr_stream *xdr, const struct nfs_writeargs *args, struct compound_hdr *hdr)
  {
         __be32 *p;
  
-       p = reserve_space(xdr, 16);
-       *p++ = cpu_to_be32(OP_COMMIT);
+       encode_op_hdr(xdr, OP_COMMIT, decode_commit_maxsz, hdr);
+       p = reserve_space(xdr, 12);
         p = xdr_encode_hyper(p, args->offset);
         *p = cpu_to_be32(args->count);
-       hdr->nops++;
-       hdr->replen += decode_commit_maxsz;
  }
  
  static void encode_create(struct xdr_stream *xdr, const struct nfs4_create_arg *create, struct compound_hdr *hdr)
  {
         __be32 *p;
  
-       p = reserve_space(xdr, 8);
-       *p++ = cpu_to_be32(OP_CREATE);
-       *p = cpu_to_be32(create->ftype);
+       encode_op_hdr(xdr, OP_CREATE, decode_create_maxsz, hdr);
+       encode_uint32(xdr, create->ftype);
  
         switch (create->ftype) {
         case NF4LNK:
@@ -1096,9 +1138,6 @@ static void encode_create(struct xdr_stream *xdr, const struct nfs4_create_arg *
         }
  
         encode_string(xdr, create->name->len, create->name->name);
-       hdr->nops++;
-       hdr->replen += decode_create_maxsz;
-
         encode_attrs(xdr, create->attrs, create->server);
  }
  
@@ -1106,25 +1145,21 @@ static void encode_getattr_one(struct xdr_stream *xdr, uint32_t bitmap, struct c
  {
         __be32 *p;
  
-       p = reserve_space(xdr, 12);
-       *p++ = cpu_to_be32(OP_GETATTR);
+       encode_op_hdr(xdr, OP_GETATTR, decode_getattr_maxsz, hdr);
+       p = reserve_space(xdr, 8);
         *p++ = cpu_to_be32(1);
         *p = cpu_to_be32(bitmap);
-       hdr->nops++;
-       hdr->replen += decode_getattr_maxsz;
  }
  
  static void encode_getattr_two(struct xdr_stream *xdr, uint32_t bm0, uint32_t bm1, struct compound_hdr *hdr)
  {
         __be32 *p;
  
-       p = reserve_space(xdr, 16);
-       *p++ = cpu_to_be32(OP_GETATTR);
+       encode_op_hdr(xdr, OP_GETATTR, decode_getattr_maxsz, hdr);
+       p = reserve_space(xdr, 12);
         *p++ = cpu_to_be32(2);
         *p++ = cpu_to_be32(bm0);
         *p = cpu_to_be32(bm1);
-       hdr->nops++;
-       hdr->replen += decode_getattr_maxsz;
  }
  
  static void
@@ -1134,8 +1169,7 @@ encode_getattr_three(struct xdr_stream *xdr,
  {
         __be32 *p;
  
-       p = reserve_space(xdr, 4);
-       *p = cpu_to_be32(OP_GETATTR);
+       encode_op_hdr(xdr, OP_GETATTR, decode_getattr_maxsz, hdr);
         if (bm2) {
                 p = reserve_space(xdr, 16);
                 *p++ = cpu_to_be32(3);
@@ -1152,8 +1186,6 @@ encode_getattr_three(struct xdr_stream *xdr,
                 *p++ = cpu_to_be32(1);
                 *p = cpu_to_be32(bm0);
         }
-       hdr->nops++;
-       hdr->replen += decode_getattr_maxsz;
  }
  
  static void encode_getfattr(struct xdr_stream *xdr, const u32* bitmask, struct compound_hdr *hdr)
@@ -1179,23 +1211,13 @@ static void encode_fs_locations(struct xdr_stream *xdr, const u32* bitmask, stru
  
  static void encode_getfh(struct xdr_stream *xdr, struct compound_hdr *hdr)
  {
-       __be32 *p;
-
-       p = reserve_space(xdr, 4);
-       *p = cpu_to_be32(OP_GETFH);
-       hdr->nops++;
-       hdr->replen += decode_getfh_maxsz;
+       encode_op_hdr(xdr, OP_GETFH, decode_getfh_maxsz, hdr);
  }
  
  static void encode_link(struct xdr_stream *xdr, const struct qstr *name, struct compound_hdr *hdr)
  {
-       __be32 *p;
-
-       p = reserve_space(xdr, 8 + name->len);
-       *p++ = cpu_to_be32(OP_LINK);
-       xdr_encode_opaque(p, name->name, name->len);
-       hdr->nops++;
-       hdr->replen += decode_link_maxsz;
+       encode_op_hdr(xdr, OP_LINK, decode_link_maxsz, hdr);
+       encode_string(xdr, name->len, name->name);
  }
  
  static inline int nfs4_lock_type(struct file_lock *fl, int block)
@@ -1232,79 +1254,60 @@ static void encode_lock(struct xdr_stream *xdr, const struct nfs_lock_args *args
  {
         __be32 *p;
  
-       p = reserve_space(xdr, 32);
-       *p++ = cpu_to_be32(OP_LOCK);
+       encode_op_hdr(xdr, OP_LOCK, decode_lock_maxsz, hdr);
+       p = reserve_space(xdr, 28);
         *p++ = cpu_to_be32(nfs4_lock_type(args->fl, args->block));
         *p++ = cpu_to_be32(args->reclaim);
         p = xdr_encode_hyper(p, args->fl->fl_start);
         p = xdr_encode_hyper(p, nfs4_lock_length(args->fl));
         *p = cpu_to_be32(args->new_lock_owner);
         if (args->new_lock_owner){
-               p = reserve_space(xdr, 4+NFS4_STATEID_SIZE+4);
-               *p++ = cpu_to_be32(args->open_seqid->sequence->counter);
-               p = xdr_encode_opaque_fixed(p, args->open_stateid->data, NFS4_STATEID_SIZE);
-               *p++ = cpu_to_be32(args->lock_seqid->sequence->counter);
+               encode_nfs4_seqid(xdr, args->open_seqid);
+               encode_nfs4_stateid(xdr, args->open_stateid);
+               encode_nfs4_seqid(xdr, args->lock_seqid);
                 encode_lockowner(xdr, &args->lock_owner);
         }
         else {
-               p = reserve_space(xdr, NFS4_STATEID_SIZE+4);
-               p = xdr_encode_opaque_fixed(p, args->lock_stateid->data, NFS4_STATEID_SIZE);
-               *p = cpu_to_be32(args->lock_seqid->sequence->counter);
+               encode_nfs4_stateid(xdr, args->lock_stateid);
+               encode_nfs4_seqid(xdr, args->lock_seqid);
         }
-       hdr->nops++;
-       hdr->replen += decode_lock_maxsz;
  }
  
  static void encode_lockt(struct xdr_stream *xdr, const struct nfs_lockt_args *args, struct compound_hdr *hdr)
  {
         __be32 *p;
  
-       p = reserve_space(xdr, 24);
-       *p++ = cpu_to_be32(OP_LOCKT);
+       encode_op_hdr(xdr, OP_LOCKT, decode_lockt_maxsz, hdr);
+       p = reserve_space(xdr, 20);
         *p++ = cpu_to_be32(nfs4_lock_type(args->fl, 0));
         p = xdr_encode_hyper(p, args->fl->fl_start);
         p = xdr_encode_hyper(p, nfs4_lock_length(args->fl));
         encode_lockowner(xdr, &args->lock_owner);
-       hdr->nops++;
-       hdr->replen += decode_lockt_maxsz;
  }
  
  static void encode_locku(struct xdr_stream *xdr, const struct nfs_locku_args *args, struct compound_hdr *hdr)
  {
         __be32 *p;
  
-       p = reserve_space(xdr, 12+NFS4_STATEID_SIZE+16);
-       *p++ = cpu_to_be32(OP_LOCKU);
-       *p++ = cpu_to_be32(nfs4_lock_type(args->fl, 0));
-       *p++ = cpu_to_be32(args->seqid->sequence->counter);
-       p = xdr_encode_opaque_fixed(p, args->stateid->data, NFS4_STATEID_SIZE);
+       encode_op_hdr(xdr, OP_LOCKU, decode_locku_maxsz, hdr);
+       encode_uint32(xdr, nfs4_lock_type(args->fl, 0));
+       encode_nfs4_seqid(xdr, args->seqid);
+       encode_nfs4_stateid(xdr, args->stateid);
+       p = reserve_space(xdr, 16);
         p = xdr_encode_hyper(p, args->fl->fl_start);
         xdr_encode_hyper(p, nfs4_lock_length(args->fl));
-       hdr->nops++;
-       hdr->replen += decode_locku_maxsz;
  }
  
  static void encode_release_lockowner(struct xdr_stream *xdr, const struct nfs_lowner *lowner, struct compound_hdr *hdr)
  {
-       __be32 *p;
-
-       p = reserve_space(xdr, 4);
-       *p = cpu_to_be32(OP_RELEASE_LOCKOWNER);
+       encode_op_hdr(xdr, OP_RELEASE_LOCKOWNER, decode_release_lockowner_maxsz, hdr);
         encode_lockowner(xdr, lowner);
-       hdr->nops++;
-       hdr->replen += decode_release_lockowner_maxsz;
  }
  
  static void encode_lookup(struct xdr_stream *xdr, const struct qstr *name, struct compound_hdr *hdr)
  {
-       int len = name->len;
-       __be32 *p;
-
-       p = reserve_space(xdr, 8 + len);
-       *p++ = cpu_to_be32(OP_LOOKUP);
-       xdr_encode_opaque(p, name->name, len);
-       hdr->nops++;
-       hdr->replen += decode_lookup_maxsz;
+       encode_op_hdr(xdr, OP_LOOKUP, decode_lookup_maxsz, hdr);
+       encode_string(xdr, name->len, name->name);
  }
  
  static void encode_share_access(struct xdr_stream *xdr, fmode_t fmode)
@@ -1335,9 +1338,7 @@ static inline void encode_openhdr(struct xdr_stream *xdr, const struct nfs_opena
   * opcode 4, seqid 4, share_access 4, share_deny 4, clientid 8, ownerlen 4,
   * owner 4 = 32
   */
-       p = reserve_space(xdr, 8);
-       *p++ = cpu_to_be32(OP_OPEN);
-       *p = cpu_to_be32(arg->seqid->sequence->counter);
+       encode_nfs4_seqid(xdr, arg->seqid);
         encode_share_access(xdr, arg->fmode);
         p = reserve_space(xdr, 32);
         p = xdr_encode_hyper(p, arg->clientid);
@@ -1437,14 +1438,15 @@ static inline void encode_claim_delegate_cur(struct xdr_stream *xdr, const struc
  {
         __be32 *p;
  
-       p = reserve_space(xdr, 4+NFS4_STATEID_SIZE);
-       *p++ = cpu_to_be32(NFS4_OPEN_CLAIM_DELEGATE_CUR);
-       xdr_encode_opaque_fixed(p, stateid->data, NFS4_STATEID_SIZE);
+       p = reserve_space(xdr, 4);
+       *p = cpu_to_be32(NFS4_OPEN_CLAIM_DELEGATE_CUR);
+       encode_nfs4_stateid(xdr, stateid);
         encode_string(xdr, name->len, name->name);
  }
  
  static void encode_open(struct xdr_stream *xdr, const struct nfs_openargs *arg, struct compound_hdr *hdr)
  {
+       encode_op_hdr(xdr, OP_OPEN, decode_open_maxsz, hdr);
         encode_openhdr(xdr, arg);
         encode_opentype(xdr, arg);
         switch (arg->claim) {
@@ -1460,88 +1462,64 @@ static void encode_open(struct xdr_stream *xdr, const struct nfs_openargs *arg,
         default:
                 BUG();
         }
-       hdr->nops++;
-       hdr->replen += decode_open_maxsz;
  }
  
  static void encode_open_confirm(struct xdr_stream *xdr, const struct nfs_open_confirmargs *arg, struct compound_hdr *hdr)
  {
-       __be32 *p;
-
-       p = reserve_space(xdr, 4+NFS4_STATEID_SIZE+4);
-       *p++ = cpu_to_be32(OP_OPEN_CONFIRM);
-       p = xdr_encode_opaque_fixed(p, arg->stateid->data, NFS4_STATEID_SIZE);
-       *p = cpu_to_be32(arg->seqid->sequence->counter);
-       hdr->nops++;
-       hdr->replen += decode_open_confirm_maxsz;
+       encode_op_hdr(xdr, OP_OPEN_CONFIRM, decode_open_confirm_maxsz, hdr);
+       encode_nfs4_stateid(xdr, arg->stateid);
+       encode_nfs4_seqid(xdr, arg->seqid);
  }
  
  static void encode_open_downgrade(struct xdr_stream *xdr, const struct nfs_closeargs *arg, struct compound_hdr *hdr)
  {
-       __be32 *p;
-
-       p = reserve_space(xdr, 4+NFS4_STATEID_SIZE+4);
-       *p++ = cpu_to_be32(OP_OPEN_DOWNGRADE);
-       p = xdr_encode_opaque_fixed(p, arg->stateid->data, NFS4_STATEID_SIZE);
-       *p = cpu_to_be32(arg->seqid->sequence->counter);
+       encode_op_hdr(xdr, OP_OPEN_DOWNGRADE, decode_open_downgrade_maxsz, hdr);
+       encode_nfs4_stateid(xdr, arg->stateid);
+       encode_nfs4_seqid(xdr, arg->seqid);
         encode_share_access(xdr, arg->fmode);
-       hdr->nops++;
-       hdr->replen += decode_open_downgrade_maxsz;
  }
  
  static void
  encode_putfh(struct xdr_stream *xdr, const struct nfs_fh *fh, struct compound_hdr *hdr)
  {
-       int len = fh->size;
-       __be32 *p;
-
-       p = reserve_space(xdr, 8 + len);
-       *p++ = cpu_to_be32(OP_PUTFH);
-       xdr_encode_opaque(p, fh->data, len);
-       hdr->nops++;
-       hdr->replen += decode_putfh_maxsz;
+       encode_op_hdr(xdr, OP_PUTFH, decode_putfh_maxsz, hdr);
+       encode_string(xdr, fh->size, fh->data);
  }
  
  static void encode_putrootfh(struct xdr_stream *xdr, struct compound_hdr *hdr)
  {
-       __be32 *p;
-
-       p = reserve_space(xdr, 4);
-       *p = cpu_to_be32(OP_PUTROOTFH);
-       hdr->nops++;
-       hdr->replen += decode_putrootfh_maxsz;
+       encode_op_hdr(xdr, OP_PUTROOTFH, decode_putrootfh_maxsz, hdr);
  }
  
-static void encode_stateid(struct xdr_stream *xdr, const struct nfs_open_context *ctx, const struct nfs_lock_context *l_ctx, int zero_seqid)
+static void encode_open_stateid(struct xdr_stream *xdr,
+               const struct nfs_open_context *ctx,
+               const struct nfs_lock_context *l_ctx,
+               fmode_t fmode,
+               int zero_seqid)
  {
         nfs4_stateid stateid;
-       __be32 *p;
  
-       p = reserve_space(xdr, NFS4_STATEID_SIZE);
         if (ctx->state != NULL) {
-               nfs4_copy_stateid(&stateid, ctx->state, l_ctx->lockowner, l_ctx->pid);
+               nfs4_select_rw_stateid(&stateid, ctx->state,
+                               fmode, l_ctx->lockowner, l_ctx->pid);
                 if (zero_seqid)
-                       stateid.stateid.seqid = 0;
-               xdr_encode_opaque_fixed(p, stateid.data, NFS4_STATEID_SIZE);
+                       stateid.seqid = 0;
+               encode_nfs4_stateid(xdr, &stateid);
         } else
-               xdr_encode_opaque_fixed(p, zero_stateid.data, NFS4_STATEID_SIZE);
+               encode_nfs4_stateid(xdr, &zero_stateid);
  }
  
  static void encode_read(struct xdr_stream *xdr, const struct nfs_readargs *args, struct compound_hdr *hdr)
  {
         __be32 *p;
  
-       p = reserve_space(xdr, 4);
-       *p = cpu_to_be32(OP_READ);
-
-       encode_stateid(xdr, args->context, args->lock_context,
-                      hdr->minorversion);
+       encode_op_hdr(xdr, OP_READ, decode_read_maxsz, hdr);
+       encode_open_stateid(xdr, args->context, args->lock_context,
+                       FMODE_READ, hdr->minorversion);
  
         p = reserve_space(xdr, 12);
         p = xdr_encode_hyper(p, args->offset);
         *p = cpu_to_be32(args->count);
-       hdr->nops++;
-       hdr->replen += decode_read_maxsz;
  }
  
  static void encode_readdir(struct xdr_stream *xdr, const struct nfs4_readdir_arg *readdir, struct rpc_rqst *req, struct compound_hdr *hdr)
@@ -1551,7 +1529,7 @@ static void encode_readdir(struct xdr_stream *xdr, const struct nfs4_readdir_arg
                 FATTR4_WORD1_MOUNTED_ON_FILEID,
         };
         uint32_t dircount = readdir->count >> 1;
-       __be32 *p;
+       __be32 *p, verf[2];
  
         if (readdir->plus) {
                 attrs[0] |= FATTR4_WORD0_TYPE|FATTR4_WORD0_CHANGE|FATTR4_WORD0_SIZE|
@@ -1566,80 +1544,54 @@ static void encode_readdir(struct xdr_stream *xdr, const struct nfs4_readdir_arg
         if (!(readdir->bitmask[1] & FATTR4_WORD1_MOUNTED_ON_FILEID))
                 attrs[0] |= FATTR4_WORD0_FILEID;
  
-       p = reserve_space(xdr, 12+NFS4_VERIFIER_SIZE+20);
-       *p++ = cpu_to_be32(OP_READDIR);
-       p = xdr_encode_hyper(p, readdir->cookie);
-       p = xdr_encode_opaque_fixed(p, readdir->verifier.data, NFS4_VERIFIER_SIZE);
+       encode_op_hdr(xdr, OP_READDIR, decode_readdir_maxsz, hdr);
+       encode_uint64(xdr, readdir->cookie);
+       encode_nfs4_verifier(xdr, &readdir->verifier);
+       p = reserve_space(xdr, 20);
         *p++ = cpu_to_be32(dircount);
         *p++ = cpu_to_be32(readdir->count);
         *p++ = cpu_to_be32(2);
  
         *p++ = cpu_to_be32(attrs[0] & readdir->bitmask[0]);
         *p = cpu_to_be32(attrs[1] & readdir->bitmask[1]);
-       hdr->nops++;
-       hdr->replen += decode_readdir_maxsz;
+       memcpy(verf, readdir->verifier.data, sizeof(verf));
         dprintk("%s: cookie = %Lu, verifier = %08x:%08x, bitmap = %08x:%08x\n",
                         __func__,
                         (unsigned long long)readdir->cookie,
-                       ((u32 *)readdir->verifier.data)[0],
-                       ((u32 *)readdir->verifier.data)[1],
+                       verf[0], verf[1],
                         attrs[0] & readdir->bitmask[0],
                         attrs[1] & readdir->bitmask[1]);
  }
  
  static void encode_readlink(struct xdr_stream *xdr, const struct nfs4_readlink *readlink, struct rpc_rqst *req, struct compound_hdr *hdr)
  {
-       __be32 *p;
-
-       p = reserve_space(xdr, 4);
-       *p = cpu_to_be32(OP_READLINK);
-       hdr->nops++;
-       hdr->replen += decode_readlink_maxsz;
+       encode_op_hdr(xdr, OP_READLINK, decode_readlink_maxsz, hdr);
  }
  
  static void encode_remove(struct xdr_stream *xdr, const struct qstr *name, struct compound_hdr *hdr)
  {
-       __be32 *p;
-
-       p = reserve_space(xdr, 8 + name->len);
-       *p++ = cpu_to_be32(OP_REMOVE);
-       xdr_encode_opaque(p, name->name, name->len);
-       hdr->nops++;
-       hdr->replen += decode_remove_maxsz;
+       encode_op_hdr(xdr, OP_REMOVE, decode_remove_maxsz, hdr);
+       encode_string(xdr, name->len, name->name);
  }
  
  static void encode_rename(struct xdr_stream *xdr, const struct qstr *oldname, const struct qstr *newname, struct compound_hdr *hdr)
  {
-       __be32 *p;
-
-       p = reserve_space(xdr, 4);
-       *p = cpu_to_be32(OP_RENAME);
+       encode_op_hdr(xdr, OP_RENAME, decode_rename_maxsz, hdr);
         encode_string(xdr, oldname->len, oldname->name);
         encode_string(xdr, newname->len, newname->name);
-       hdr->nops++;
-       hdr->replen += decode_rename_maxsz;
  }
  
-static void encode_renew(struct xdr_stream *xdr, const struct nfs_client *client_stateid, struct compound_hdr *hdr)
+static void encode_renew(struct xdr_stream *xdr, clientid4 clid,
+                        struct compound_hdr *hdr)
  {
-       __be32 *p;
-
-       p = reserve_space(xdr, 12);
-       *p++ = cpu_to_be32(OP_RENEW);
-       xdr_encode_hyper(p, client_stateid->cl_clientid);
-       hdr->nops++;
-       hdr->replen += decode_renew_maxsz;
+       encode_op_hdr(xdr, OP_RENEW, decode_renew_maxsz, hdr);
+       encode_uint64(xdr, clid);
  }
  
  static void
  encode_restorefh(struct xdr_stream *xdr, struct compound_hdr *hdr)
  {
-       __be32 *p;
-
-       p = reserve_space(xdr, 4);
-       *p = cpu_to_be32(OP_RESTOREFH);
-       hdr->nops++;
-       hdr->replen += decode_restorefh_maxsz;
+       encode_op_hdr(xdr, OP_RESTOREFH, decode_restorefh_maxsz, hdr);
  }
  
  static void
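Note on the hunks above: each encoder used to reserve space for its opcode, write it, and then bump hdr->nops and hdr->replen by hand. The replacement lines route all three steps through encode_op_hdr(). The following is a minimal userspace sketch of that pattern, not the kernel implementation. The toy_* types and functions are hypothetical stand-ins for xdr_stream, compound_hdr and the kernel helpers, and the reply length of 2 words is only an illustrative value.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <arpa/inet.h>	/* htonl(), ntohl() */

/* Hypothetical stand-in for the kernel's struct xdr_stream. */
struct toy_stream {
	uint32_t words[64];	/* XDR data is a sequence of 32-bit big-endian words */
	size_t used;		/* number of words written so far */
};

/* Hypothetical stand-in for struct compound_hdr. */
struct toy_compound_hdr {
	uint32_t nops;		/* operations encoded into the COMPOUND so far */
	uint32_t replen;	/* running estimate of the reply size, in words */
};

/* Reserve nwords 32-bit slots and return a pointer to the first one. */
static uint32_t *toy_reserve(struct toy_stream *xdr, size_t nwords)
{
	uint32_t *p;

	assert(xdr->used + nwords <= sizeof(xdr->words) / sizeof(xdr->words[0]));
	p = &xdr->words[xdr->used];
	xdr->used += nwords;
	return p;
}

static void toy_encode_uint32(struct toy_stream *xdr, uint32_t v)
{
	*toy_reserve(xdr, 1) = htonl(v);
}

/* Write the opcode and do the per-operation bookkeeping in one place. */
static void toy_encode_op_hdr(struct toy_stream *xdr, uint32_t op,
			      uint32_t replen, struct toy_compound_hdr *hdr)
{
	toy_encode_uint32(xdr, op);
	hdr->nops++;
	hdr->replen += replen;
}

/* An operation without arguments then reduces to a single call. */
static void toy_encode_restorefh(struct toy_stream *xdr,
				 struct toy_compound_hdr *hdr)
{
	/* 31 is the RESTOREFH opcode in RFC 3530; 2 is an illustrative size. */
	toy_encode_op_hdr(xdr, 31, 2, hdr);
}
```

Keeping the bookkeeping next to the opcode write means an encoder can no longer forget to update nops or replen on some exit path, which is presumably the motivation for the refactoring.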
@@ -1647,9 +1599,8 @@ encode_setacl(struct xdr_stream *xdr, struct nfs_setaclargs *arg, struct compoun
  {
         __be32 *p;
  
-       p = reserve_space(xdr, 4+NFS4_STATEID_SIZE);
-       *p++ = cpu_to_be32(OP_SETATTR);
-       xdr_encode_opaque_fixed(p, zero_stateid.data, NFS4_STATEID_SIZE);
+       encode_op_hdr(xdr, OP_SETATTR, decode_setacl_maxsz, hdr);
+       encode_nfs4_stateid(xdr, &zero_stateid);
         p = reserve_space(xdr, 2*4);
         *p++ = cpu_to_be32(1);
         *p = cpu_to_be32(FATTR4_WORD0_ACL);
@@ -1657,30 +1608,18 @@ encode_setacl(struct xdr_stream *xdr, struct nfs_setaclargs *arg, struct compoun
         p = reserve_space(xdr, 4);
         *p = cpu_to_be32(arg->acl_len);
         xdr_write_pages(xdr, arg->acl_pages, arg->acl_pgbase, arg->acl_len);
-       hdr->nops++;
-       hdr->replen += decode_setacl_maxsz;
  }
  
  static void
  encode_savefh(struct xdr_stream *xdr, struct compound_hdr *hdr)
  {
-       __be32 *p;
-
-       p = reserve_space(xdr, 4);
-       *p = cpu_to_be32(OP_SAVEFH);
-       hdr->nops++;
-       hdr->replen += decode_savefh_maxsz;
+       encode_op_hdr(xdr, OP_SAVEFH, decode_savefh_maxsz, hdr);
  }
  
  static void encode_setattr(struct xdr_stream *xdr, const struct nfs_setattrargs *arg, const struct nfs_server *server, struct compound_hdr *hdr)
  {
-       __be32 *p;
-
-       p = reserve_space(xdr, 4+NFS4_STATEID_SIZE);
-       *p++ = cpu_to_be32(OP_SETATTR);
-       xdr_encode_opaque_fixed(p, arg->stateid.data, NFS4_STATEID_SIZE);
-       hdr->nops++;
-       hdr->replen += decode_setattr_maxsz;
+       encode_op_hdr(xdr, OP_SETATTR, decode_setattr_maxsz, hdr);
+       encode_nfs4_stateid(xdr, &arg->stateid);
         encode_attrs(xdr, arg->iap, server);
  }
  
@@ -1688,9 +1627,8 @@ static void encode_setclientid(struct xdr_stream *xdr, const struct nfs4_setclie
  {
         __be32 *p;
  
-       p = reserve_space(xdr, 4 + NFS4_VERIFIER_SIZE);
-       *p++ = cpu_to_be32(OP_SETCLIENTID);
-       xdr_encode_opaque_fixed(p, setclientid->sc_verifier->data, NFS4_VERIFIER_SIZE);
+       encode_op_hdr(xdr, OP_SETCLIENTID, decode_setclientid_maxsz, hdr);
+       encode_nfs4_verifier(xdr, setclientid->sc_verifier);
  
         encode_string(xdr, setclientid->sc_name_len, setclientid->sc_name);
         p = reserve_space(xdr, 4);
@@ -1699,31 +1637,23 @@ static void encode_setclientid(struct xdr_stream *xdr, const struct nfs4_setclie
         encode_string(xdr, setclientid->sc_uaddr_len, setclientid->sc_uaddr);
         p = reserve_space(xdr, 4);
         *p = cpu_to_be32(setclientid->sc_cb_ident);
-       hdr->nops++;
-       hdr->replen += decode_setclientid_maxsz;
  }
  
  static void encode_setclientid_confirm(struct xdr_stream *xdr, const struct nfs4_setclientid_res *arg, struct compound_hdr *hdr)
  {
-       __be32 *p;
-
-       p = reserve_space(xdr, 12 + NFS4_VERIFIER_SIZE);
-       *p++ = cpu_to_be32(OP_SETCLIENTID_CONFIRM);
-       p = xdr_encode_hyper(p, arg->clientid);
-       xdr_encode_opaque_fixed(p, arg->confirm.data, NFS4_VERIFIER_SIZE);
-       hdr->nops++;
-       hdr->replen += decode_setclientid_confirm_maxsz;
+       encode_op_hdr(xdr, OP_SETCLIENTID_CONFIRM,
+                       decode_setclientid_confirm_maxsz, hdr);
+       encode_uint64(xdr, arg->clientid);
+       encode_nfs4_verifier(xdr, &arg->confirm);
  }
  
  static void encode_write(struct xdr_stream *xdr, const struct nfs_writeargs *args, struct compound_hdr *hdr)
  {
         __be32 *p;
  
-       p = reserve_space(xdr, 4);
-       *p = cpu_to_be32(OP_WRITE);
-
-       encode_stateid(xdr, args->context, args->lock_context,
-                      hdr->minorversion);
+       encode_op_hdr(xdr, OP_WRITE, decode_write_maxsz, hdr);
+       encode_open_stateid(xdr, args->context, args->lock_context,
+                       FMODE_WRITE, hdr->minorversion);
  
         p = reserve_space(xdr, 16);
         p = xdr_encode_hyper(p, args->offset);
@@ -1731,32 +1661,18 @@ static void encode_write(struct xdr_stream *xdr, const struct nfs_writeargs *arg
         *p = cpu_to_be32(args->count);
  
         xdr_write_pages(xdr, args->pages, args->pgbase, args->count);
-       hdr->nops++;
-       hdr->replen += decode_write_maxsz;
  }
  
  static void encode_delegreturn(struct xdr_stream *xdr, const nfs4_stateid *stateid, struct compound_hdr *hdr)
  {
-       __be32 *p;
-
-       p = reserve_space(xdr, 4+NFS4_STATEID_SIZE);
-
-       *p++ = cpu_to_be32(OP_DELEGRETURN);
-       xdr_encode_opaque_fixed(p, stateid->data, NFS4_STATEID_SIZE);
-       hdr->nops++;
-       hdr->replen += decode_delegreturn_maxsz;
+       encode_op_hdr(xdr, OP_DELEGRETURN, decode_delegreturn_maxsz, hdr);
+       encode_nfs4_stateid(xdr, stateid);
  }
  
  static void encode_secinfo(struct xdr_stream *xdr, const struct qstr *name, struct compound_hdr *hdr)
  {
-       int len = name->len;
-       __be32 *p;
-
-       p = reserve_space(xdr, 8 + len);
-       *p++ = cpu_to_be32(OP_SECINFO);
-       xdr_encode_opaque(p, name->name, len);
-       hdr->nops++;
-       hdr->replen += decode_secinfo_maxsz;
+       encode_op_hdr(xdr, OP_SECINFO, decode_secinfo_maxsz, hdr);
+       encode_string(xdr, name->len, name->name);
  }
  
  #if defined(CONFIG_NFS_V4_1)
@@ -1766,19 +1682,39 @@ static void encode_exchange_id(struct xdr_stream *xdr,
                                struct compound_hdr *hdr)
  {
         __be32 *p;
+       char impl_name[NFS4_OPAQUE_LIMIT];
+       int len = 0;
  
-       p = reserve_space(xdr, 4 + sizeof(args->verifier->data));
-       *p++ = cpu_to_be32(OP_EXCHANGE_ID);
-       xdr_encode_opaque_fixed(p, args->verifier->data, sizeof(args->verifier->data));
+       encode_op_hdr(xdr, OP_EXCHANGE_ID, decode_exchange_id_maxsz, hdr);
+       encode_nfs4_verifier(xdr, args->verifier);
  
         encode_string(xdr, args->id_len, args->id);
  
         p = reserve_space(xdr, 12);
         *p++ = cpu_to_be32(args->flags);
         *p++ = cpu_to_be32(0);  /* zero length state_protect4_a */
-       *p = cpu_to_be32(0);    /* zero length implementation id array */
-       hdr->nops++;
-       hdr->replen += decode_exchange_id_maxsz;
+
+       if (send_implementation_id &&
+           sizeof(CONFIG_NFS_V4_1_IMPLEMENTATION_ID_DOMAIN) > 1 &&
+           sizeof(CONFIG_NFS_V4_1_IMPLEMENTATION_ID_DOMAIN)
+               <= NFS4_OPAQUE_LIMIT + 1)
+               len = snprintf(impl_name, sizeof(impl_name), "%s %s %s %s",
+                              utsname()->sysname, utsname()->release,
+                              utsname()->version, utsname()->machine);
+
+       if (len > 0) {
+               *p = cpu_to_be32(1);    /* implementation id array length=1 */
+
+               encode_string(xdr,
+                       sizeof(CONFIG_NFS_V4_1_IMPLEMENTATION_ID_DOMAIN) - 1,
+                       CONFIG_NFS_V4_1_IMPLEMENTATION_ID_DOMAIN);
+               encode_string(xdr, len, impl_name);
+               /* just send zeros for nii_date - the date is in nii_name */
+               p = reserve_space(xdr, 12);
+               p = xdr_encode_hyper(p, 0);
+               *p = cpu_to_be32(0);
+       } else
+               *p = cpu_to_be32(0);    /* implementation id array length=0 */
  }
  
  static void encode_create_session(struct xdr_stream *xdr,
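Note on the encode_exchange_id() hunk above: nii_domain and nii_name are sent with encode_string(), and the domain length is computed as sizeof(...) - 1 because sizeof on a string literal counts the terminating NUL. On the wire, an XDR variable-length opaque is a 32-bit big-endian length followed by the bytes, zero-padded to a multiple of four (RFC 4506). The sketch below shows that layout in plain userspace C. The toy_* names are hypothetical and this is not the kernel's encode_string().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Number of bytes an XDR opaque<> holding len bytes occupies on the wire. */
static size_t toy_opaque_wire_size(size_t len)
{
	return 4 + ((len + 3) & ~(size_t)3);
}

/*
 * Encode len bytes of data at out as an XDR opaque<> and return the number
 * of bytes written. The caller must provide toy_opaque_wire_size(len) bytes.
 */
static size_t toy_put_opaque(unsigned char *out, const void *data, uint32_t len)
{
	size_t total = toy_opaque_wire_size(len);

	assert(out != NULL && data != NULL);

	/* 32-bit big-endian length prefix */
	out[0] = (unsigned char)(len >> 24);
	out[1] = (unsigned char)(len >> 16);
	out[2] = (unsigned char)(len >> 8);
	out[3] = (unsigned char)len;

	memcpy(out + 4, data, len);
	/* zero-fill the pad so no stale memory reaches the wire */
	memset(out + 4 + len, 0, total - 4 - len);
	return total;
}
```

For example, a 5-byte string occupies 12 bytes: 4 for the length, 5 for the data and 3 of padding.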
@@ -1801,8 +1737,8 @@ static void encode_create_session(struct xdr_stream *xdr,
         len = scnprintf(machine_name, sizeof(machine_name), "%s",
                         clp->cl_ipaddr);
  
-       p = reserve_space(xdr, 20 + 2*28 + 20 + len + 12);
-       *p++ = cpu_to_be32(OP_CREATE_SESSION);
+       encode_op_hdr(xdr, OP_CREATE_SESSION, decode_create_session_maxsz, hdr);
+       p = reserve_space(xdr, 16 + 2*28 + 20 + len + 12);
         p = xdr_encode_hyper(p, clp->cl_clientid);
         *p++ = cpu_to_be32(clp->cl_seqid);                      /*Sequence id */
         *p++ = cpu_to_be32(args->flags);                        /*flags */
@@ -1835,33 +1771,22 @@ static void encode_create_session(struct xdr_stream *xdr,
         *p++ = cpu_to_be32(0);                          /* UID */
         *p++ = cpu_to_be32(0);                          /* GID */
         *p = cpu_to_be32(0);                            /* No more gids */
-       hdr->nops++;
-       hdr->replen += decode_create_session_maxsz;
  }
  
  static void encode_destroy_session(struct xdr_stream *xdr,
                                    struct nfs4_session *session,
                                    struct compound_hdr *hdr)
  {
-       __be32 *p;
-       p = reserve_space(xdr, 4 + NFS4_MAX_SESSIONID_LEN);
-       *p++ = cpu_to_be32(OP_DESTROY_SESSION);
-       xdr_encode_opaque_fixed(p, session->sess_id.data, NFS4_MAX_SESSIONID_LEN);
-       hdr->nops++;
-       hdr->replen += decode_destroy_session_maxsz;
+       encode_op_hdr(xdr, OP_DESTROY_SESSION, decode_destroy_session_maxsz, hdr);
+       encode_opaque_fixed(xdr, session->sess_id.data, NFS4_MAX_SESSIONID_LEN);
  }
  
  static void encode_reclaim_complete(struct xdr_stream *xdr,
                                     struct nfs41_reclaim_complete_args *args,
                                     struct compound_hdr *hdr)
  {
-       __be32 *p;
-
-       p = reserve_space(xdr, 8);
-       *p++ = cpu_to_be32(OP_RECLAIM_COMPLETE);
-       *p++ = cpu_to_be32(args->one_fs);
-       hdr->nops++;
-       hdr->replen += decode_reclaim_complete_maxsz;
+       encode_op_hdr(xdr, OP_RECLAIM_COMPLETE, decode_reclaim_complete_maxsz, hdr);
+       encode_uint32(xdr, args->one_fs);
  }
  #endif /* CONFIG_NFS_V4_1 */
  
@@ -1883,8 +1808,7 @@ static void encode_sequence(struct xdr_stream *xdr,
         WARN_ON(args->sa_slotid == NFS4_MAX_SLOT_TABLE);
         slot = tp->slots + args->sa_slotid;
  
-       p = reserve_space(xdr, 4 + NFS4_MAX_SESSIONID_LEN + 16);
-       *p++ = cpu_to_be32(OP_SEQUENCE);
+       encode_op_hdr(xdr, OP_SEQUENCE, decode_sequence_maxsz, hdr);
  
         /*
          * Sessionid + seqid + slotid + max slotid + cache_this
@@ -1898,13 +1822,12 @@ static void encode_sequence(struct xdr_stream *xdr,
                 ((u32 *)session->sess_id.data)[3],
                 slot->seq_nr, args->sa_slotid,
                 tp->highest_used_slotid, args->sa_cache_this);
+       p = reserve_space(xdr, NFS4_MAX_SESSIONID_LEN + 16);
         p = xdr_encode_opaque_fixed(p, session->sess_id.data, NFS4_MAX_SESSIONID_LEN);
         *p++ = cpu_to_be32(slot->seq_nr);
         *p++ = cpu_to_be32(args->sa_slotid);
         *p++ = cpu_to_be32(tp->highest_used_slotid);
         *p = cpu_to_be32(args->sa_cache_this);
-       hdr->nops++;
-       hdr->replen += decode_sequence_maxsz;
  #endif /* CONFIG_NFS_V4_1 */
  }
  
@@ -1919,14 +1842,12 @@ encode_getdevicelist(struct xdr_stream *xdr,
                 .data = "dummmmmy",
         };
  
-       p = reserve_space(xdr, 20);
-       *p++ = cpu_to_be32(OP_GETDEVICELIST);
+       encode_op_hdr(xdr, OP_GETDEVICELIST, decode_getdevicelist_maxsz, hdr);
+       p = reserve_space(xdr, 16);
         *p++ = cpu_to_be32(args->layoutclass);
         *p++ = cpu_to_be32(NFS4_PNFS_GETDEVLIST_MAXNUM);
         xdr_encode_hyper(p, 0ULL);                          /* cookie */
         encode_nfs4_verifier(xdr, &dummy);
-       hdr->nops++;
-       hdr->replen += decode_getdevicelist_maxsz;
  }
  
  static void
@@ -1936,15 +1857,13 @@ encode_getdeviceinfo(struct xdr_stream *xdr,
  {
         __be32 *p;
  
-       p = reserve_space(xdr, 16 + NFS4_DEVICEID4_SIZE);
-       *p++ = cpu_to_be32(OP_GETDEVICEINFO);
+       encode_op_hdr(xdr, OP_GETDEVICEINFO, decode_getdeviceinfo_maxsz, hdr);
+       p = reserve_space(xdr, 12 + NFS4_DEVICEID4_SIZE);
         p = xdr_encode_opaque_fixed(p, args->pdev->dev_id.data,
                                     NFS4_DEVICEID4_SIZE);
         *p++ = cpu_to_be32(args->pdev->layout_type);
         *p++ = cpu_to_be32(args->pdev->pglen);          /* gdia_maxcount */
         *p++ = cpu_to_be32(0);                          /* bitmap length 0 */
-       hdr->nops++;
-       hdr->replen += decode_getdeviceinfo_maxsz;
  }
  
  static void
@@ -1954,16 +1873,16 @@ encode_layoutget(struct xdr_stream *xdr,
  {
         __be32 *p;
  
-       p = reserve_space(xdr, 44 + NFS4_STATEID_SIZE);
-       *p++ = cpu_to_be32(OP_LAYOUTGET);
+       encode_op_hdr(xdr, OP_LAYOUTGET, decode_layoutget_maxsz, hdr);
+       p = reserve_space(xdr, 36);
         *p++ = cpu_to_be32(0);     /* Signal layout available */
         *p++ = cpu_to_be32(args->type);
         *p++ = cpu_to_be32(args->range.iomode);
         p = xdr_encode_hyper(p, args->range.offset);
         p = xdr_encode_hyper(p, args->range.length);
         p = xdr_encode_hyper(p, args->minlength);
-       p = xdr_encode_opaque_fixed(p, &args->stateid.data, NFS4_STATEID_SIZE);
-       *p = cpu_to_be32(args->maxcount);
+       encode_nfs4_stateid(xdr, &args->stateid);
+       encode_uint32(xdr, args->maxcount);
  
         dprintk("%s: 1st type:0x%x iomode:%d off:%lu len:%lu mc:%d\n",
                 __func__,
@@ -1972,8 +1891,6 @@ encode_layoutget(struct xdr_stream *xdr,
                 (unsigned long)args->range.offset,
                 (unsigned long)args->range.length,
                 args->maxcount);
-       hdr->nops++;
-       hdr->replen += decode_layoutget_maxsz;
  }
  
  static int
@@ -1987,13 +1904,14 @@ encode_layoutcommit(struct xdr_stream *xdr,
         dprintk("%s: lbw: %llu type: %d\n", __func__, args->lastbytewritten,
                 NFS_SERVER(args->inode)->pnfs_curr_ld->id);
  
-       p = reserve_space(xdr, 44 + NFS4_STATEID_SIZE);
-       *p++ = cpu_to_be32(OP_LAYOUTCOMMIT);
+       encode_op_hdr(xdr, OP_LAYOUTCOMMIT, decode_layoutcommit_maxsz, hdr);
+       p = reserve_space(xdr, 20);
         /* Only whole file layouts */
         p = xdr_encode_hyper(p, 0); /* offset */
         p = xdr_encode_hyper(p, args->lastbytewritten + 1);     /* length */
-       *p++ = cpu_to_be32(0); /* reclaim */
-       p = xdr_encode_opaque_fixed(p, args->stateid.data, NFS4_STATEID_SIZE);
+       *p = cpu_to_be32(0); /* reclaim */
+       encode_nfs4_stateid(xdr, &args->stateid);
+       p = reserve_space(xdr, 20);
         *p++ = cpu_to_be32(1); /* newoffset = TRUE */
         p = xdr_encode_hyper(p, args->lastbytewritten);
         *p++ = cpu_to_be32(0); /* Never send time_modify_changed */
@@ -2002,13 +1920,9 @@ encode_layoutcommit(struct xdr_stream *xdr,
         if (NFS_SERVER(inode)->pnfs_curr_ld->encode_layoutcommit)
                 NFS_SERVER(inode)->pnfs_curr_ld->encode_layoutcommit(
                         NFS_I(inode)->layout, xdr, args);
-       else {
-               p = reserve_space(xdr, 4);
-               *p = cpu_to_be32(0); /* no layout-type payload */
-       }
+       else
+               encode_uint32(xdr, 0); /* no layout-type payload */
  
-       hdr->nops++;
-       hdr->replen += decode_layoutcommit_maxsz;
         return 0;
  }
  
@@ -2019,27 +1933,23 @@ encode_layoutreturn(struct xdr_stream *xdr,
  {
         __be32 *p;
  
-       p = reserve_space(xdr, 20);
-       *p++ = cpu_to_be32(OP_LAYOUTRETURN);
+       encode_op_hdr(xdr, OP_LAYOUTRETURN, decode_layoutreturn_maxsz, hdr);
+       p = reserve_space(xdr, 16);
         *p++ = cpu_to_be32(0);          /* reclaim. always 0 for now */
         *p++ = cpu_to_be32(args->layout_type);
         *p++ = cpu_to_be32(IOMODE_ANY);
         *p = cpu_to_be32(RETURN_FILE);
-       p = reserve_space(xdr, 16 + NFS4_STATEID_SIZE);
+       p = reserve_space(xdr, 16);
         p = xdr_encode_hyper(p, 0);
         p = xdr_encode_hyper(p, NFS4_MAX_UINT64);
         spin_lock(&args->inode->i_lock);
-       xdr_encode_opaque_fixed(p, &args->stateid.data, NFS4_STATEID_SIZE);
+       encode_nfs4_stateid(xdr, &args->stateid);
         spin_unlock(&args->inode->i_lock);
         if (NFS_SERVER(args->inode)->pnfs_curr_ld->encode_layoutreturn) {
                 NFS_SERVER(args->inode)->pnfs_curr_ld->encode_layoutreturn(
                         NFS_I(args->inode)->layout, xdr, args);
-       } else {
-               p = reserve_space(xdr, 4);
-               *p = cpu_to_be32(0);
-       }
-       hdr->nops++;
-       hdr->replen += decode_layoutreturn_maxsz;
+       } else
+               encode_uint32(xdr, 0);
  }
  
  static int
@@ -2047,12 +1957,8 @@ encode_secinfo_no_name(struct xdr_stream *xdr,
                        const struct nfs41_secinfo_no_name_args *args,
                        struct compound_hdr *hdr)
  {
-       __be32 *p;
-       p = reserve_space(xdr, 8);
-       *p++ = cpu_to_be32(OP_SECINFO_NO_NAME);
-       *p++ = cpu_to_be32(args->style);
-       hdr->nops++;
-       hdr->replen += decode_secinfo_no_name_maxsz;
+       encode_op_hdr(xdr, OP_SECINFO_NO_NAME, decode_secinfo_no_name_maxsz, hdr);
+       encode_uint32(xdr, args->style);
         return 0;
  }
  
@@ -2060,26 +1966,17 @@ static void encode_test_stateid(struct xdr_stream *xdr,
                                 struct nfs41_test_stateid_args *args,
                                 struct compound_hdr *hdr)
  {
-       __be32 *p;
-
-       p = reserve_space(xdr, 8 + NFS4_STATEID_SIZE);
-       *p++ = cpu_to_be32(OP_TEST_STATEID);
-       *p++ = cpu_to_be32(1);
-       xdr_encode_opaque_fixed(p, args->stateid->data, NFS4_STATEID_SIZE);
-       hdr->nops++;
-       hdr->replen += decode_test_stateid_maxsz;
+       encode_op_hdr(xdr, OP_TEST_STATEID, decode_test_stateid_maxsz, hdr);
+       encode_uint32(xdr, 1);
+       encode_nfs4_stateid(xdr, args->stateid);
  }
  
  static void encode_free_stateid(struct xdr_stream *xdr,
                                 struct nfs41_free_stateid_args *args,
                                 struct compound_hdr *hdr)
  {
-       __be32 *p;
-       p = reserve_space(xdr, 4 + NFS4_STATEID_SIZE);
-       *p++ = cpu_to_be32(OP_FREE_STATEID);
-       xdr_encode_opaque_fixed(p, args->stateid->data, NFS4_STATEID_SIZE);
-       hdr->nops++;
-       hdr->replen += decode_free_stateid_maxsz;
+       encode_op_hdr(xdr, OP_FREE_STATEID, decode_free_stateid_maxsz, hdr);
+       encode_nfs4_stateid(xdr, args->stateid);
  }
  #endif /* CONFIG_NFS_V4_1 */
  
@@ -2633,6 +2530,7 @@ static void nfs4_xdr_enc_server_caps(struct rpc_rqst *req,
         encode_sequence(xdr, &args->seq_args, &hdr);
         encode_putfh(xdr, args->fhandle, &hdr);
         encode_getattr_one(xdr, FATTR4_WORD0_SUPPORTED_ATTRS|
+                          FATTR4_WORD0_FH_EXPIRE_TYPE|
                            FATTR4_WORD0_LINK_SUPPORT|
                            FATTR4_WORD0_SYMLINK_SUPPORT|
                            FATTR4_WORD0_ACLSUPPORT, &hdr);
@@ -2650,7 +2548,7 @@ static void nfs4_xdr_enc_renew(struct rpc_rqst *req, struct xdr_stream *xdr,
         };
  
         encode_compound_hdr(xdr, req, &hdr);
-       encode_renew(xdr, clp, &hdr);
+       encode_renew(xdr, clp->cl_clientid, &hdr);
         encode_nops(&hdr);
  }
  
@@ -3180,6 +3078,28 @@ out_overflow:
         return -EIO;
  }
  
+static int decode_attr_fh_expire_type(struct xdr_stream *xdr,
+                                     uint32_t *bitmap, uint32_t *type)
+{
+       __be32 *p;
+
+       *type = 0;
+       if (unlikely(bitmap[0] & (FATTR4_WORD0_FH_EXPIRE_TYPE - 1U)))
+               return -EIO;
+       if (likely(bitmap[0] & FATTR4_WORD0_FH_EXPIRE_TYPE)) {
+               p = xdr_inline_decode(xdr, 4);
+               if (unlikely(!p))
+                       goto out_overflow;
+               *type = be32_to_cpup(p);
+               bitmap[0] &= ~FATTR4_WORD0_FH_EXPIRE_TYPE;
+       }
+       dprintk("%s: expire type=0x%x\n", __func__, *type);
+       return 0;
+out_overflow:
+       print_overflow_msg(__func__, xdr);
+       return -EIO;
+}
+
  static int decode_attr_change(struct xdr_stream *xdr, uint32_t *bitmap, uint64_t *change)
  {
         __be32 *p;
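Note on decode_attr_fh_expire_type() above: like the other decode_attr_* helpers, it relies on attributes arriving in ascending bit order. It first rejects the reply if any lower-numbered bit is still set, then consumes its own value and clears its bit so the next helper can apply the same check. The following userspace sketch reproduces that sequence under stated assumptions: the toy_* names are hypothetical, the bit position (attribute number 2) is taken from RFC 3530 rather than from this diff, and -1 stands in for -EIO.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_ATTR_FH_EXPIRE_TYPE (1U << 2)	/* assumption: attribute number 2 */

/* Hypothetical stand-in for the decoding side of struct xdr_stream. */
struct toy_reader {
	const unsigned char *p;
	size_t remaining;
};

/* Return a pointer to the next n bytes, or NULL if the buffer is too short. */
static const unsigned char *toy_inline_decode(struct toy_reader *r, size_t n)
{
	const unsigned char *q;

	assert(r != NULL);
	if (r->remaining < n)
		return NULL;
	q = r->p;
	r->p += n;
	r->remaining -= n;
	return q;
}

/* Returns 0 on success, -1 on a malformed or truncated reply. */
static int toy_decode_fh_expire_type(struct toy_reader *r, uint32_t *bitmap,
				     uint32_t *type)
{
	const unsigned char *q;

	*type = 0;
	/* A lower bit still set means an earlier attribute was not consumed. */
	if (bitmap[0] & (TOY_ATTR_FH_EXPIRE_TYPE - 1U))
		return -1;
	if (bitmap[0] & TOY_ATTR_FH_EXPIRE_TYPE) {
		q = toy_inline_decode(r, 4);
		if (q == NULL)
			return -1;
		/* 32-bit big-endian value */
		*type = ((uint32_t)q[0] << 24) | ((uint32_t)q[1] << 16) |
			((uint32_t)q[2] << 8) | (uint32_t)q[3];
		bitmap[0] &= ~TOY_ATTR_FH_EXPIRE_TYPE;
	}
	return 0;
}
```

Subtracting 1 from a single-bit mask yields a mask of all lower bits, which is what makes the ordering check a single AND.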
@@ -3513,16 +3433,17 @@ static int decode_pathname(struct xdr_stream *xdr, struct nfs4_pathname *path)
         n = be32_to_cpup(p);
         if (n == 0)
                 goto root_path;
-       dprintk("path ");
+       dprintk("pathname4: ");
         path->ncomponents = 0;
         while (path->ncomponents < n) {
                 struct nfs4_string *component = &path->components[path->ncomponents];
                 status = decode_opaque_inline(xdr, &component->len, &component->data);
                 if (unlikely(status != 0))
                         goto out_eio;
-               if (path->ncomponents != n)
-                       dprintk("/");
-               dprintk("%s", component->data);
+               ifdebug (XDR)
+                       pr_cont("%s%.*s ",
+                               (path->ncomponents != n ? "/ " : ""),
+                               component->len, component->data);
                 if (path->ncomponents < NFS4_PATHNAME_MAXCOMPONENTS)
                         path->ncomponents++;
                 else {
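Note on the decode_pathname() hunk above: the debug output switches from "%s" to "%.*s" with an explicit component->len. A likely reason is that decode_opaque_inline() hands back a pointer into the receive buffer, so the component is a counted byte string with no guaranteed NUL terminator. The userspace sketch below shows the precision form of the conversion. toy_format_component() is a hypothetical helper, not a kernel function.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Format a counted (not NUL-terminated) byte string into out.
 * The precision argument of "%.*s" must be an int; it caps how many
 * bytes are read from data.
 */
static int toy_format_component(char *out, size_t outlen,
				const char *data, unsigned int len)
{
	assert(len <= 0x7fffffffU);
	return snprintf(out, outlen, "%.*s", (int)len, data);
}
```

With a plain "%s", the formatter would keep reading past the component until it happened to find a zero byte.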
@@ -3531,14 +3452,13 @@ static int decode_pathname(struct xdr_stream *xdr, struct nfs4_pathname *path)
                 }
         }
  out:
-       dprintk("\n");
         return status;
  root_path:
  /* a root pathname is sent as a zero component4 */
         path->ncomponents = 1;
         path->components[0].len=0;
         path->components[0].data=NULL;
-       dprintk("path /\n");
+       dprintk("pathname4: /\n");
         goto out;
  out_eio:
         dprintk(" status %d", status);
@@ -3560,7 +3480,11 @@ static int decode_attr_fs_locations(struct xdr_stream *xdr, uint32_t *bitmap, st
         status = 0;
         if (unlikely(!(bitmap[0] & FATTR4_WORD0_FS_LOCATIONS)))
                 goto out;
-       dprintk("%s: fsroot ", __func__);
+       status = -EIO;
+       /* Ignore borken servers that return unrequested attrs */
+       if (unlikely(res == NULL))
+               goto out;
+       dprintk("%s: fsroot:\n", __func__);
         status = decode_pathname(xdr, &res->fs_path);
         if (unlikely(status != 0))
                 goto out;
@@ -3581,7 +3505,7 @@ static int decode_attr_fs_locations(struct xdr_stream *xdr, uint32_t *bitmap, st
                 m = be32_to_cpup(p);
  
                 loc->nservers = 0;
-               dprintk("%s: servers ", __func__);
+               dprintk("%s: servers:\n", __func__);
                 while (loc->nservers < m) {
                         struct nfs4_string *server = &loc->servers[loc->nservers];
                         status = decode_opaque_inline(xdr, &server->len, &server->data);
@@ -3613,7 +3537,7 @@ static int decode_attr_fs_locations(struct xdr_stream *xdr, uint32_t *bitmap, st
                         res->nlocations++;
         }
         if (res->nlocations != 0)
-               status = NFS_ATTR_FATTR_V4_REFERRAL;
+               status = NFS_ATTR_FATTR_V4_LOCATIONS;
  out:
         dprintk("%s: fs_locations done, error = %d\n", __func__, status);
         return status;
@@ -4157,7 +4081,7 @@ static int decode_opaque_fixed(struct xdr_stream *xdr, void *buf, size_t len)
  
  static int decode_stateid(struct xdr_stream *xdr, nfs4_stateid *stateid)
  {
-       return decode_opaque_fixed(xdr, stateid->data, NFS4_STATEID_SIZE);
+       return decode_opaque_fixed(xdr, stateid, NFS4_STATEID_SIZE);
  }
  
  static int decode_close(struct xdr_stream *xdr, struct nfs_closeres *res)
@@ -4174,7 +4098,7 @@ static int decode_close(struct xdr_stream *xdr, struct nfs_closeres *res)
  
  static int decode_verifier(struct xdr_stream *xdr, void *verifier)
  {
-       return decode_opaque_fixed(xdr, verifier, 8);
+       return decode_opaque_fixed(xdr, verifier, NFS4_VERIFIER_SIZE);
  }
  
  static int decode_commit(struct xdr_stream *xdr, struct nfs_writeres *res)
@@ -4224,6 +4148,9 @@ static int decode_server_caps(struct xdr_stream *xdr, struct nfs4_server_caps_re
                 goto xdr_error;
         if ((status = decode_attr_supported(xdr, bitmap, res->attr_bitmask)) != 0)
                 goto xdr_error;
+       if ((status = decode_attr_fh_expire_type(xdr, bitmap,
+                                                &res->fh_expire_type)) != 0)
+               goto xdr_error;
         if ((status = decode_attr_link_support(xdr, bitmap, &res->has_links)) != 0)
                 goto xdr_error;
         if ((status = decode_attr_symlink_support(xdr, bitmap, &res->has_symlinks)) != 0)
@@ -4294,6 +4221,7 @@ xdr_error:
  
  static int decode_getfattr_attrs(struct xdr_stream *xdr, uint32_t *bitmap,
                 struct nfs_fattr *fattr, struct nfs_fh *fh,
+               struct nfs4_fs_locations *fs_loc,
                 const struct nfs_server *server)
  {
         int status;
@@ -4341,9 +4269,7 @@ static int decode_getfattr_attrs(struct xdr_stream *xdr, uint32_t *bitmap,
                 goto xdr_error;
         fattr->valid |= status;
  
-       status = decode_attr_fs_locations(xdr, bitmap, container_of(fattr,
-                                               struct nfs4_fs_locations,
-                                               fattr));
+       status = decode_attr_fs_locations(xdr, bitmap, fs_loc);
         if (status < 0)
                 goto xdr_error;
         fattr->valid |= status;
@@ -4407,7 +4333,8 @@ xdr_error:
  }
  
  static int decode_getfattr_generic(struct xdr_stream *xdr, struct nfs_fattr *fattr,
-               struct nfs_fh *fh, const struct nfs_server *server)
+               struct nfs_fh *fh, struct nfs4_fs_locations *fs_loc,
+               const struct nfs_server *server)
  {
         __be32 *savep;
         uint32_t attrlen,
@@ -4426,7 +4353,7 @@ static int decode_getfattr_generic(struct xdr_stream *xdr, struct nfs_fattr *fat
         if (status < 0)
                 goto xdr_error;
  
-       status = decode_getfattr_attrs(xdr, bitmap, fattr, fh, server);
+       status = decode_getfattr_attrs(xdr, bitmap, fattr, fh, fs_loc, server);
         if (status < 0)
                 goto xdr_error;
  
@@ -4439,7 +4366,7 @@ xdr_error:
  static int decode_getfattr(struct xdr_stream *xdr, struct nfs_fattr *fattr,
                 const struct nfs_server *server)
  {
-       return decode_getfattr_generic(xdr, fattr, NULL, server);
+       return decode_getfattr_generic(xdr, fattr, NULL, NULL, server);
  }
  
  /*
@@ -4463,8 +4390,8 @@ static int decode_first_pnfs_layout_type(struct xdr_stream *xdr,
                 return 0;
         }
         if (num > 1)
-               printk(KERN_INFO "%s: Warning: Multiple pNFS layout drivers "
-                       "per filesystem not supported\n", __func__);
+               printk(KERN_INFO "NFS: %s: Warning: Multiple pNFS layout "
+                       "drivers per filesystem not supported\n", __func__);
  
         /* Decode and set first layout type, move xdr->p past unused types */
         p = xdr_inline_decode(xdr, num * 4);
@@ -4863,17 +4790,16 @@ static int decode_readdir(struct xdr_stream *xdr, struct rpc_rqst *req, struct n
         size_t          hdrlen;
         u32             recvd, pglen = rcvbuf->page_len;
         int             status;
+       __be32          verf[2];
  
         status = decode_op_hdr(xdr, OP_READDIR);
         if (!status)
                 status = decode_verifier(xdr, readdir->verifier.data);
         if (unlikely(status))
                 return status;
+       memcpy(verf, readdir->verifier.data, sizeof(verf));
         dprintk("%s: verifier = %08x:%08x\n",
-                       __func__,
-                       ((u32 *)readdir->verifier.data)[0],
-                       ((u32 *)readdir->verifier.data)[1]);
-
+                       __func__, verf[0], verf[1]);
  
         hdrlen = (char *) xdr->p - (char *) iov->iov_base;
         recvd = rcvbuf->len - hdrlen;
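Note on the decode_readdir() hunk above (and the matching change in encode_readdir()): the verifier is an opaque byte array, so the old ((u32 *)readdir->verifier.data)[0] cast assumed an alignment and an effective type that the array does not promise. The new code copies the bytes into a local two-word array first. A minimal userspace sketch of the same idea follows. The toy_* names are hypothetical, and the size of 8 matches what the diff replaces with NFS4_VERIFIER_SIZE.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define TOY_VERIFIER_SIZE 8	/* stands in for NFS4_VERIFIER_SIZE */

struct toy_verifier {
	char data[TOY_VERIFIER_SIZE];	/* opaque bytes, no alignment guarantee */
};

/* Copy the opaque bytes into properly typed, properly aligned words. */
static void toy_verifier_words(const struct toy_verifier *v, uint32_t words[2])
{
	assert(v != NULL && words != NULL);
	memcpy(words, v->data, TOY_VERIFIER_SIZE);
}
```

The words are only meant for printing; their numeric value depends on host byte order, exactly as it did with the cast.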
@@ -5120,7 +5046,7 @@ static int decode_write(struct xdr_stream *xdr, struct nfs_writeres *res)
                 goto out_overflow;
         res->count = be32_to_cpup(p++);
         res->verf->committed = be32_to_cpup(p++);
-       memcpy(res->verf->verifier, p, 8);
+       memcpy(res->verf->verifier, p, NFS4_VERIFIER_SIZE);
         return 0;
  out_overflow:
         print_overflow_msg(__func__, xdr);
@@ -5214,6 +5140,7 @@ static int decode_exchange_id(struct xdr_stream *xdr,
         char *dummy_str;
         int status;
         struct nfs_client *clp = res->client;
+       uint32_t impl_id_count;
  
         status = decode_op_hdr(xdr, OP_EXCHANGE_ID);
         if (status)
@@ -5255,11 +5182,38 @@ static int decode_exchange_id(struct xdr_stream *xdr,
         memcpy(res->server_scope->server_scope, dummy_str, dummy);
         res->server_scope->server_scope_sz = dummy;
  
-       /* Throw away Implementation id array */
-       status = decode_opaque_inline(xdr, &dummy, &dummy_str);
-       if (unlikely(status))
-               return status;
+       /* Implementation Id */
+       p = xdr_inline_decode(xdr, 4);
+       if (unlikely(!p))
+               goto out_overflow;
+       impl_id_count = be32_to_cpup(p++);
+
+       if (impl_id_count) {
+               /* nii_domain */
+               status = decode_opaque_inline(xdr, &dummy, &dummy_str);
+               if (unlikely(status))
+                       return status;
+               if (unlikely(dummy > NFS4_OPAQUE_LIMIT))
+                       return -EIO;
+               memcpy(res->impl_id->domain, dummy_str, dummy);
  
+               /* nii_name */
+               status = decode_opaque_inline(xdr, &dummy, &dummy_str);
+               if (unlikely(status))
+                       return status;
+               if (unlikely(dummy > NFS4_OPAQUE_LIMIT))
+                       return -EIO;
+               memcpy(res->impl_id->name, dummy_str, dummy);
+
+               /* nii_date */
+               p = xdr_inline_decode(xdr, 12);
+               if (unlikely(!p))
+                       goto out_overflow;
+               p = xdr_decode_hyper(p, &res->impl_id->date.seconds);
+               res->impl_id->date.nseconds = be32_to_cpup(p);
+
+               /* if there's more than one entry, ignore the rest */
+       }
         return 0;
  out_overflow:
         print_overflow_msg(__func__, xdr);
@@ -5285,8 +5239,8 @@ static int decode_chan_attrs(struct xdr_stream *xdr,
         attrs->max_reqs = be32_to_cpup(p++);
         nr_attrs = be32_to_cpup(p);
         if (unlikely(nr_attrs > 1)) {
-               printk(KERN_WARNING "%s: Invalid rdma channel attrs count %u\n",
-                       __func__, nr_attrs);
+               printk(KERN_WARNING "NFS: %s: Invalid rdma channel attrs "
+                       "count %u\n", __func__, nr_attrs);
                 return -EINVAL;
         }
         if (nr_attrs == 1) {
@@ -5436,14 +5390,14 @@ static int decode_getdevicelist(struct xdr_stream *xdr,
         p += 2;
  
         /* Read verifier */
-       p = xdr_decode_opaque_fixed(p, verftemp.verifier, 8);
+       p = xdr_decode_opaque_fixed(p, verftemp.verifier, NFS4_VERIFIER_SIZE);
  
         res->num_devs = be32_to_cpup(p);
  
         dprintk("%s: num_dev %d\n", __func__, res->num_devs);
  
         if (res->num_devs > NFS4_PNFS_GETDEVLIST_MAXNUM) {
-               printk(KERN_ERR "%s too many result dev_num %u\n",
+               printk(KERN_ERR "NFS: %s too many result dev_num %u\n",
                                 __func__, res->num_devs);
                 return -EIO;
         }
@@ -5537,11 +5491,14 @@ static int decode_layoutget(struct xdr_stream *xdr, struct rpc_rqst *req,
         status = decode_op_hdr(xdr, OP_LAYOUTGET);
         if (status)
                 return status;
-       p = xdr_inline_decode(xdr, 8 + NFS4_STATEID_SIZE);
+       p = xdr_inline_decode(xdr, 4);
+       if (unlikely(!p))
+               goto out_overflow;
+       res->return_on_close = be32_to_cpup(p);
+       decode_stateid(xdr, &res->stateid);
+       p = xdr_inline_decode(xdr, 4);
         if (unlikely(!p))
                 goto out_overflow;
-       res->return_on_close = be32_to_cpup(p++);
-       p = xdr_decode_opaque_fixed(p, res->stateid.data, NFS4_STATEID_SIZE);
         layout_count = be32_to_cpup(p);
         if (!layout_count) {
                 dprintk("%s: server responded with empty layout array\n",
@@ -5666,7 +5623,8 @@ static int decode_test_stateid(struct xdr_stream *xdr,
         if (unlikely(!p))
                 goto out_overflow;
         res->status = be32_to_cpup(p++);
-       return res->status;
+
+       return status;
  out_overflow:
         print_overflow_msg(__func__, xdr);
  out:
@@ -6583,8 +6541,9 @@ static int nfs4_xdr_dec_fs_locations(struct rpc_rqst *req,
         if (status)
                 goto out;
         xdr_enter_page(xdr, PAGE_SIZE);
-       status = decode_getfattr(xdr, &res->fs_locations->fattr,
-                                res->fs_locations->server);
+       status = decode_getfattr_generic(xdr, &res->fs_locations->fattr,
+                                        NULL, res->fs_locations,
+                                        res->fs_locations->server);
  out:
         return status;
  }
@@ -6964,7 +6923,7 @@ int nfs4_decode_dirent(struct xdr_stream *xdr, struct nfs_entry *entry,
                 goto out_overflow;
  
         if (decode_getfattr_attrs(xdr, bitmap, entry->fattr, entry->fh,
-                                       entry->server) < 0)
+                                 NULL, entry->server) < 0)
                 goto out_overflow;
         if (entry->fattr->valid & NFS_ATTR_FATTR_MOUNTED_ON_FILEID)
                 entry->ino = entry->fattr->mounted_on_fileid;
@@ -7112,7 +7071,7 @@ struct rpc_procinfo       nfs4_procedures[] = {
  #endif /* CONFIG_NFS_V4_1 */
  };
  
-struct rpc_version             nfs_version4 = {
+const struct rpc_version nfs_version4 = {
         .number                 = 4,
         .nrprocs                = ARRAY_SIZE(nfs4_procedures),
         .procs                  = nfs4_procedures
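Note on the nii_date hunk in fs/nfs/nfs4xdr.c above: the new code pulls 12 bytes from the XDR stream, decodes a 64-bit "hyper" seconds value, then a 32-bit nanoseconds value. The following is a minimal userspace sketch of that wire layout, not the kernel implementation. All `sketch_*` names are hypothetical, and the explicit length check stands in for the `out_overflow` path.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical userspace stand-in for the decoded nii_date fields. */
struct sketch_nfstime4 {
	uint64_t seconds;
	uint32_t nseconds;
};

/* Read a big-endian 32-bit value from a (possibly unaligned) buffer. */
static uint32_t sketch_get_be32(const unsigned char *p)
{
	return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
	       ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/*
 * Decode 12 bytes of XDR: a 64-bit seconds field followed by a 32-bit
 * nanoseconds field, both big-endian. Returns 0 on success, or -1 when
 * fewer than 12 bytes are available (the overflow case).
 */
static int sketch_decode_nfstime4(const unsigned char *buf, size_t len,
				  struct sketch_nfstime4 *out)
{
	if (len < 12)
		return -1;
	out->seconds = ((uint64_t)sketch_get_be32(buf) << 32) |
		       sketch_get_be32(buf + 4);
	out->nseconds = sketch_get_be32(buf + 8);
	return 0;
}
```

Assembling the value byte by byte keeps the sketch independent of host endianness and alignment; the kernel code relies on xdr_decode_hyper() and be32_to_cpup() for the same purpose.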
diff --git a/fs/nfs/nfsroot.c b/fs/nfs/nfsroot.c
index c4744e1d513c826545898e3310631c5f8153ae98..cd3c910d2d129ee687d197da97b00c9c0cb3cc13 100644
--- a/fs/nfs/nfsroot.c
+++ b/fs/nfs/nfsroot.c
@@ -104,7 +104,7 @@ static char nfs_export_path[NFS_MAXPATHLEN + 1] __initdata = "";
  /* server:export path string passed to super.c */
  static char nfs_root_device[NFS_MAXPATHLEN + 1] __initdata = "";
  
-#ifdef RPC_DEBUG
+#ifdef NFS_DEBUG
  /*
   * When the "nfsrootdebug" kernel command line option is specified,
   * enable debugging messages for NFSROOT.
diff --git a/fs/nfs/objlayout/objio_osd.c b/fs/nfs/objlayout/objio_osd.c
index 55d01280a6098264cc5e6d7133c72347e392d109..4bff4a3dab4602ffa8fe1f48df5d3adc3e8709c3 100644
--- a/fs/nfs/objlayout/objio_osd.c
+++ b/fs/nfs/objlayout/objio_osd.c
@@ -137,6 +137,7 @@ static int objio_devices_lookup(struct pnfs_layout_hdr *pnfslay,
         struct objio_dev_ent *ode;
         struct osd_dev *od;
         struct osd_dev_info odi;
+       bool retry_flag = true;
         int err;
  
         ode = _dev_list_find(NFS_SERVER(pnfslay->plh_inode), d_id);
@@ -171,10 +172,18 @@ static int objio_devices_lookup(struct pnfs_layout_hdr *pnfslay,
                 goto out;
         }
  
+retry_lookup:
         od = osduld_info_lookup(&odi);
         if (unlikely(IS_ERR(od))) {
                 err = PTR_ERR(od);
                 dprintk("%s: osduld_info_lookup => %d\n", __func__, err);
+               if (err == -ENODEV && retry_flag) {
+                       err = objlayout_autologin(deviceaddr);
+                       if (likely(!err)) {
+                               retry_flag = false;
+                               goto retry_lookup;
+                       }
+               }
                 goto out;
         }
  
@@ -205,25 +214,36 @@ static void copy_single_comp(struct ore_components *oc, unsigned c,
  int __alloc_objio_seg(unsigned numdevs, gfp_t gfp_flags,
                        struct objio_segment **pseg)
  {
-       struct __alloc_objio_segment {
-               struct objio_segment olseg;
-               struct ore_dev *ods[numdevs];
-               struct ore_comp comps[numdevs];
-       } *aolseg;
-
-       aolseg = kzalloc(sizeof(*aolseg), gfp_flags);
-       if (unlikely(!aolseg)) {
+/*     This is the in memory structure of the objio_segment
+ *
+ *     struct __alloc_objio_segment {
+ *             struct objio_segment olseg;
+ *             struct ore_dev *ods[numdevs];
+ *             struct ore_comp comps[numdevs];
+ *     } *aolseg;
+ *     NOTE: The code as above compiles and runs perfectly. It is elegant,
+ *     type safe and compact. At some Past time Linus has decided he does not
+ *     like variable length arrays, For the sake of this principal we uglify
+ *     the code as below.
+ */
+       struct objio_segment *lseg;
+       size_t lseg_size = sizeof(*lseg) +
+                       numdevs * sizeof(lseg->oc.ods[0]) +
+                       numdevs * sizeof(*lseg->oc.comps);
+
+       lseg = kzalloc(lseg_size, gfp_flags);
+       if (unlikely(!lseg)) {
                 dprintk("%s: Faild allocation numdevs=%d size=%zd\n", __func__,
-                       numdevs, sizeof(*aolseg));
+                       numdevs, lseg_size);
                 return -ENOMEM;
         }
  
-       aolseg->olseg.oc.numdevs = numdevs;
-       aolseg->olseg.oc.single_comp = EC_MULTPLE_COMPS;
-       aolseg->olseg.oc.comps = aolseg->comps;
-       aolseg->olseg.oc.ods = aolseg->ods;
+       lseg->oc.numdevs = numdevs;
+       lseg->oc.single_comp = EC_MULTPLE_COMPS;
+       lseg->oc.ods = (void *)(lseg + 1);
+       lseg->oc.comps = (void *)(lseg->oc.ods + numdevs);
  
-       *pseg = &aolseg->olseg;
+       *pseg = lseg;
         return 0;
  }
  
@@ -582,10 +602,10 @@ objlayout_init(void)
  
         if (ret)
                 printk(KERN_INFO
-                       "%s: Registering OSD pNFS Layout Driver failed: error=%d\n",
+                       "NFS: %s: Registering OSD pNFS Layout Driver failed: error=%d\n",
                         __func__, ret);
         else
-               printk(KERN_INFO "%s: Registered OSD pNFS Layout Driver\n",
+               printk(KERN_INFO "NFS: %s: Registered OSD pNFS Layout Driver\n",
                         __func__);
         return ret;
  }
@@ -594,7 +614,7 @@ static void __exit
  objlayout_exit(void)
  {
         pnfs_unregister_layoutdriver(&objlayout_type);
-       printk(KERN_INFO "%s: Unregistered OSD pNFS Layout Driver\n",
+       printk(KERN_INFO "NFS: %s: Unregistered OSD pNFS Layout Driver\n",
                __func__);
  }
  
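Note on the __alloc_objio_seg() hunk above: the patch replaces a struct containing variable length arrays with one allocation whose trailing arrays are located by pointer arithmetic. The following is a rough userspace sketch of that layout, assuming calloc() in place of kzalloc(). The `sketch_*` types are hypothetical stand-ins for the kernel structures, and the overflow check is an addition that the patch itself does not contain.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-ins for struct ore_dev and struct ore_comp. */
struct sketch_dev { int id; };
struct sketch_comp { unsigned int obj_id; unsigned int flags; };

/* Hypothetical stand-in for struct objio_segment and its 'oc' member. */
struct sketch_segment {
	unsigned int numdevs;
	struct sketch_dev **ods;	/* points just past this struct */
	struct sketch_comp *comps;	/* points just past the ods array */
};

/*
 * One zeroed allocation laid out as:
 *   [segment][ods[numdevs]][comps[numdevs]]
 * Returns NULL if the allocation fails or the size would overflow.
 */
static struct sketch_segment *sketch_alloc_segment(unsigned int numdevs)
{
	struct sketch_segment *seg;
	size_t per_dev = sizeof(seg->ods[0]) + sizeof(seg->comps[0]);
	size_t size;

	/* Assumption: reject sizes that would wrap around size_t. */
	if (numdevs > (SIZE_MAX - sizeof(*seg)) / per_dev)
		return NULL;
	size = sizeof(*seg) + (size_t)numdevs * per_dev;

	seg = calloc(1, size);
	if (!seg)
		return NULL;

	seg->numdevs = numdevs;
	seg->ods = (void *)(seg + 1);
	seg->comps = (void *)(seg->ods + numdevs);
	return seg;
}
```

The layout only works when each trailing array starts at a suitably aligned address. In this sketch the pointer array follows a struct that itself contains pointers, and the component struct needs no stricter alignment than a pointer. A single free() releases the whole segment.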
diff --git a/fs/nfs/objlayout/objlayout.c b/fs/nfs/objlayout/objlayout.c
index b3c29039f5b893e69058cd404547d218cfa8dff8..8d45f1c318ce40ac453b7b4a71288e71ba3c6a34 100644
--- a/fs/nfs/objlayout/objlayout.c
+++ b/fs/nfs/objlayout/objlayout.c
@@ -37,6 +37,9 @@
   *  SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
   */
  
+#include <linux/kmod.h>
+#include <linux/moduleparam.h>
+#include <linux/ratelimit.h>
  #include <scsi/osd_initiator.h>
  #include "objlayout.h"
  
@@ -156,7 +159,7 @@ last_byte_offset(u64 start, u64 len)
         return end > start ? end - 1 : NFS4_MAX_UINT64;
  }
  
-void _fix_verify_io_params(struct pnfs_layout_segment *lseg,
+static void _fix_verify_io_params(struct pnfs_layout_segment *lseg,
                            struct page ***p_pages, unsigned *p_pgbase,
                            u64 offset, unsigned long count)
  {
@@ -490,9 +493,9 @@ encode_accumulated_error(struct objlayout *objlay, __be32 *p)
                         if (!ioerr->oer_errno)
                                 continue;
  
-                       printk(KERN_ERR "%s: err[%d]: errno=%d is_write=%d "
-                               "dev(%llx:%llx) par=0x%llx obj=0x%llx "
-                               "offset=0x%llx length=0x%llx\n",
+                       printk(KERN_ERR "NFS: %s: err[%d]: errno=%d "
+                               "is_write=%d dev(%llx:%llx) par=0x%llx "
+                               "obj=0x%llx offset=0x%llx length=0x%llx\n",
                                 __func__, i, ioerr->oer_errno,
                                 ioerr->oer_iswrite,
                                 _DEVID_LO(&ioerr->oer_component.oid_device_id),
@@ -651,3 +654,134 @@ void objlayout_put_deviceinfo(struct pnfs_osd_deviceaddr *deviceaddr)
         __free_page(odi->page);
         kfree(odi);
  }
+
+enum {
+       OBJLAYOUT_MAX_URI_LEN = 256, OBJLAYOUT_MAX_OSDNAME_LEN = 64,
+       OBJLAYOUT_MAX_SYSID_HEX_LEN = OSD_SYSTEMID_LEN * 2 + 1,
+       OSD_LOGIN_UPCALL_PATHLEN  = 256
+};
+
+static char osd_login_prog[OSD_LOGIN_UPCALL_PATHLEN] = "/sbin/osd_login";
+
+module_param_string(osd_login_prog, osd_login_prog, sizeof(osd_login_prog),
+                   0600);
+MODULE_PARM_DESC(osd_login_prog, "Path to the osd_login upcall program");
+
+struct __auto_login {
+       char uri[OBJLAYOUT_MAX_URI_LEN];
+       char osdname[OBJLAYOUT_MAX_OSDNAME_LEN];
+       char systemid_hex[OBJLAYOUT_MAX_SYSID_HEX_LEN];
+};
+
+static int __objlayout_upcall(struct __auto_login *login)
+{
+       static char *envp[] = { "HOME=/",
+               "TERM=linux",
+               "PATH=/sbin:/usr/sbin:/bin:/usr/bin",
+               NULL
+       };
+       char *argv[8];
+       int ret;
+
+       if (unlikely(!osd_login_prog[0])) {
+               dprintk("%s: osd_login_prog is disabled\n", __func__);
+               return -EACCES;
+       }
+
+       dprintk("%s uri: %s\n", __func__, login->uri);
+       dprintk("%s osdname %s\n", __func__, login->osdname);
+       dprintk("%s systemid_hex %s\n", __func__, login->systemid_hex);
+
+       argv[0] = (char *)osd_login_prog;
+       argv[1] = "-u";
+       argv[2] = login->uri;
+       argv[3] = "-o";
+       argv[4] = login->osdname;
+       argv[5] = "-s";
+       argv[6] = login->systemid_hex;
+       argv[7] = NULL;
+
+       ret = call_usermodehelper(argv[0], argv, envp, UMH_WAIT_PROC);
+       /*
+        * Disable the upcall mechanism if we're getting an ENOENT or
+        * EACCES error. The admin can re-enable it on the fly by using
+        * sysfs to set the objlayoutdriver.osd_login_prog module parameter once
+        * the problem has been fixed.
+        */
+       if (ret == -ENOENT || ret == -EACCES) {
+               printk(KERN_ERR "PNFS-OBJ: %s was not found please set "
+                       "objlayoutdriver.osd_login_prog kernel parameter!\n",
+                       osd_login_prog);
+               osd_login_prog[0] = '\0';
+       }
+       dprintk("%s %s return value: %d\n", __func__, osd_login_prog, ret);
+
+       return ret;
+}
+
+/* Assume dest is all zeros */
+static void __copy_nfsS_and_zero_terminate(struct nfs4_string s,
+                                          char *dest, int max_len,
+                                          const char *var_name)
+{
+       if (!s.len)
+               return;
+
+       if (s.len >= max_len) {
+               pr_warn_ratelimited(
+                       "objlayout_autologin: %s: s.len(%d) >= max_len(%d)",
+                       var_name, s.len, max_len);
+               s.len = max_len - 1; /* space for null terminator */
+       }
+
+       memcpy(dest, s.data, s.len);
+}
+
+/* Assume sysid is all zeros */
+static void _sysid_2_hex(struct nfs4_string s,
+                 char sysid[OBJLAYOUT_MAX_SYSID_HEX_LEN])
+{
+       int i;
+       char *cur;
+
+       if (!s.len)
+               return;
+
+       if (s.len != OSD_SYSTEMID_LEN) {
+               pr_warn_ratelimited(
+                   "objlayout_autologin: systemid_len(%d) != OSD_SYSTEMID_LEN",
+                   s.len);
+               if (s.len > OSD_SYSTEMID_LEN)
+                       s.len = OSD_SYSTEMID_LEN;
+       }
+
+       cur = sysid;
+       for (i = 0; i < s.len; i++)
+               cur = hex_byte_pack(cur, s.data[i]);
+}
+
+int objlayout_autologin(struct pnfs_osd_deviceaddr *deviceaddr)
+{
+       int rc;
+       struct __auto_login login;
+
+       if (!deviceaddr->oda_targetaddr.ota_netaddr.r_addr.len)
+               return -ENODEV;
+
+       memset(&login, 0, sizeof(login));
+       __copy_nfsS_and_zero_terminate(
+               deviceaddr->oda_targetaddr.ota_netaddr.r_addr,
+               login.uri, sizeof(login.uri), "URI");
+
+       __copy_nfsS_and_zero_terminate(
+               deviceaddr->oda_osdname,
+               login.osdname, sizeof(login.osdname), "OSDNAME");
+
+       _sysid_2_hex(deviceaddr->oda_systemid, login.systemid_hex);
+
+       rc = __objlayout_upcall(&login);
+       if (rc > 0) /* script returns positive values */
+               rc = -ENODEV;
+
+       return rc;
+}
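Note on _sysid_2_hex() above: the helper clamps an oversized system ID and packs each byte as two hex digits into a buffer that the caller has already zeroed. The following is a small userspace sketch of the same idea. `sketch_hex_byte_pack()` is a hypothetical stand-in for the kernel's hex_byte_pack(), and unlike the patch the sketch writes the terminating NUL itself.

```c
#include <stddef.h>

/*
 * Write the two lowercase hex digits of 'byte' at 'buf' and return the
 * next write position.
 */
static char *sketch_hex_byte_pack(char *buf, unsigned char byte)
{
	static const char digits[] = "0123456789abcdef";

	*buf++ = digits[byte >> 4];
	*buf++ = digits[byte & 0x0f];
	return buf;
}

/*
 * Hex-encode at most 'max_bytes' bytes of 'data' into 'out', which must
 * have room for 2 * max_bytes + 1 chars. Returns the number of input
 * bytes that were encoded.
 */
static size_t sketch_sysid_to_hex(const unsigned char *data, size_t len,
				  size_t max_bytes, char *out)
{
	char *cur = out;
	size_t i;

	if (len > max_bytes)
		len = max_bytes;	/* clamp oversized input */
	for (i = 0; i < len; i++)
		cur = sketch_hex_byte_pack(cur, data[i]);
	*cur = '\0';
	return len;
}
```

Sizing the output as twice the maximum input length plus one mirrors OBJLAYOUT_MAX_SYSID_HEX_LEN, which is defined as OSD_SYSTEMID_LEN * 2 + 1.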
diff --git a/fs/nfs/objlayout/objlayout.h b/fs/nfs/objlayout/objlayout.h
index 8ec34727ed210fcf306376a0e7e9a7cf835ba435..880ba086be9499315d59d3957ed29d5e915f0ceb 100644
--- a/fs/nfs/objlayout/objlayout.h
+++ b/fs/nfs/objlayout/objlayout.h
@@ -184,4 +184,6 @@ extern void objlayout_encode_layoutreturn(
         struct xdr_stream *,
         const struct nfs4_layoutreturn_args *);
  
+extern int objlayout_autologin(struct pnfs_osd_deviceaddr *deviceaddr);
+
  #endif /* _OBJLAYOUT_H */
diff --git a/fs/nfs/pagelist.c b/fs/nfs/pagelist.c
index 5668f7c54c41e2d35ff1afd2c24a2bf786406c71..d21fceaa9f6263fecff450506653c21ba055872f 100644
--- a/fs/nfs/pagelist.c
+++ b/fs/nfs/pagelist.c
@@ -13,6 +13,7 @@
  #include <linux/file.h>
  #include <linux/sched.h>
  #include <linux/sunrpc/clnt.h>
+#include <linux/nfs.h>
  #include <linux/nfs3.h>
  #include <linux/nfs4.h>
  #include <linux/nfs_page.h>
@@ -106,36 +107,6 @@ void nfs_unlock_request(struct nfs_page *req)
         nfs_release_request(req);
  }
  
-/**
- * nfs_set_page_tag_locked - Tag a request as locked
- * @req:
- */
-int nfs_set_page_tag_locked(struct nfs_page *req)
-{
-       if (!nfs_lock_request_dontget(req))
-               return 0;
-       if (test_bit(PG_MAPPED, &req->wb_flags))
-               radix_tree_tag_set(&NFS_I(req->wb_context->dentry->d_inode)->nfs_page_tree, req->wb_index, NFS_PAGE_TAG_LOCKED);
-       return 1;
-}
-
-/**
- * nfs_clear_page_tag_locked - Clear request tag and wake up sleepers
- */
-void nfs_clear_page_tag_locked(struct nfs_page *req)
-{
-       if (test_bit(PG_MAPPED, &req->wb_flags)) {
-               struct inode *inode = req->wb_context->dentry->d_inode;
-               struct nfs_inode *nfsi = NFS_I(inode);
-
-               spin_lock(&inode->i_lock);
-               radix_tree_tag_clear(&nfsi->nfs_page_tree, req->wb_index, NFS_PAGE_TAG_LOCKED);
-               nfs_unlock_request(req);
-               spin_unlock(&inode->i_lock);
-       } else
-               nfs_unlock_request(req);
-}
-
  /*
   * nfs_clear_request - Free up all resources allocated to the request
   * @req:
@@ -425,67 +396,6 @@ void nfs_pageio_cond_complete(struct nfs_pageio_descriptor *desc, pgoff_t index)
         }
  }
  
-#define NFS_SCAN_MAXENTRIES 16
-/**
- * nfs_scan_list - Scan a list for matching requests
- * @nfsi: NFS inode
- * @dst: Destination list
- * @idx_start: lower bound of page->index to scan
- * @npages: idx_start + npages sets the upper bound to scan.
- * @tag: tag to scan for
- *
- * Moves elements from one of the inode request lists.
- * If the number of requests is set to 0, the entire address_space
- * starting at index idx_start, is scanned.
- * The requests are *not* checked to ensure that they form a contiguous set.
- * You must be holding the inode's i_lock when calling this function
- */
-int nfs_scan_list(struct nfs_inode *nfsi,
-               struct list_head *dst, pgoff_t idx_start,
-               unsigned int npages, int tag)
-{
-       struct nfs_page *pgvec[NFS_SCAN_MAXENTRIES];
-       struct nfs_page *req;
-       pgoff_t idx_end;
-       int found, i;
-       int res;
-       struct list_head *list;
-
-       res = 0;
-       if (npages == 0)
-               idx_end = ~0;
-       else
-               idx_end = idx_start + npages - 1;
-
-       for (;;) {
-               found = radix_tree_gang_lookup_tag(&nfsi->nfs_page_tree,
-                               (void **)&pgvec[0], idx_start,
-                               NFS_SCAN_MAXENTRIES, tag);
-               if (found <= 0)
-                       break;
-               for (i = 0; i < found; i++) {
-                       req = pgvec[i];
-                       if (req->wb_index > idx_end)
-                               goto out;
-                       idx_start = req->wb_index + 1;
-                       if (nfs_set_page_tag_locked(req)) {
-                               kref_get(&req->wb_kref);
-                               radix_tree_tag_clear(&nfsi->nfs_page_tree,
-                                               req->wb_index, tag);
-                               list = pnfs_choose_commit_list(req, dst);
-                               nfs_list_add_request(req, list);
-                               res++;
-                               if (res == INT_MAX)
-                                       goto out;
-                       }
-               }
-               /* for latency reduction */
-               cond_resched_lock(&nfsi->vfs_inode.i_lock);
-       }
-out:
-       return res;
-}
-
  int __init nfs_init_nfspagecache(void)
  {
         nfs_page_cachep = kmem_cache_create("nfs_page",
diff --git a/fs/nfs/pnfs.c b/fs/nfs/pnfs.c
index 17149a4900653af5326dfd6e0490e6e439b1426f..b5d4515869436dc6bd16a483590a433ac04c665c 100644
--- a/fs/nfs/pnfs.c
+++ b/fs/nfs/pnfs.c
@@ -101,8 +101,8 @@ set_pnfs_layoutdriver(struct nfs_server *server, const struct nfs_fh *mntfh,
                 goto out_no_driver;
         if (!(server->nfs_client->cl_exchange_flags &
                  (EXCHGID4_FLAG_USE_NON_PNFS | EXCHGID4_FLAG_USE_PNFS_MDS))) {
-               printk(KERN_ERR "%s: id %u cl_exchange_flags 0x%x\n", __func__,
-                      id, server->nfs_client->cl_exchange_flags);
+               printk(KERN_ERR "NFS: %s: id %u cl_exchange_flags 0x%x\n",
+                       __func__, id, server->nfs_client->cl_exchange_flags);
                 goto out_no_driver;
         }
         ld_type = find_pnfs_driver(id);
@@ -122,8 +122,8 @@ set_pnfs_layoutdriver(struct nfs_server *server, const struct nfs_fh *mntfh,
         server->pnfs_curr_ld = ld_type;
         if (ld_type->set_layoutdriver
             && ld_type->set_layoutdriver(server, mntfh)) {
-               printk(KERN_ERR "%s: Error initializing pNFS layout driver %u.\n",
-                               __func__, id);
+               printk(KERN_ERR "NFS: %s: Error initializing pNFS layout "
+                       "driver %u.\n", __func__, id);
                 module_put(ld_type->owner);
                 goto out_no_driver;
         }
@@ -143,11 +143,11 @@ pnfs_register_layoutdriver(struct pnfs_layoutdriver_type *ld_type)
         struct pnfs_layoutdriver_type *tmp;
  
         if (ld_type->id == 0) {
-               printk(KERN_ERR "%s id 0 is reserved\n", __func__);
+               printk(KERN_ERR "NFS: %s id 0 is reserved\n", __func__);
                 return status;
         }
         if (!ld_type->alloc_lseg || !ld_type->free_lseg) {
-               printk(KERN_ERR "%s Layout driver must provide "
+               printk(KERN_ERR "NFS: %s Layout driver must provide "
                        "alloc_lseg and free_lseg.\n", __func__);
                 return status;
         }
@@ -160,7 +160,7 @@ pnfs_register_layoutdriver(struct pnfs_layoutdriver_type *ld_type)
                 dprintk("%s Registering id:%u name:%s\n", __func__, ld_type->id,
                         ld_type->name);
         } else {
-               printk(KERN_ERR "%s Module with id %d already loaded!\n",
+               printk(KERN_ERR "NFS: %s Module with id %d already loaded!\n",
                         __func__, ld_type->id);
         }
         spin_unlock(&pnfs_spinlock);
@@ -496,12 +496,12 @@ pnfs_set_layout_stateid(struct pnfs_layout_hdr *lo, const nfs4_stateid *new,
  {
         u32 oldseq, newseq;
  
-       oldseq = be32_to_cpu(lo->plh_stateid.stateid.seqid);
-       newseq = be32_to_cpu(new->stateid.seqid);
+       oldseq = be32_to_cpu(lo->plh_stateid.seqid);
+       newseq = be32_to_cpu(new->seqid);
         if ((int)(newseq - oldseq) > 0) {
-               memcpy(&lo->plh_stateid, &new->stateid, sizeof(new->stateid));
+               nfs4_stateid_copy(&lo->plh_stateid, new);
                 if (update_barrier) {
-                       u32 new_barrier = be32_to_cpu(new->stateid.seqid);
+                       u32 new_barrier = be32_to_cpu(new->seqid);
  
                         if ((int)(new_barrier - lo->plh_barrier))
                                 lo->plh_barrier = new_barrier;
@@ -525,7 +525,7 @@ pnfs_layoutgets_blocked(struct pnfs_layout_hdr *lo, nfs4_stateid *stateid,
                         int lget)
  {
         if ((stateid) &&
-           (int)(lo->plh_barrier - be32_to_cpu(stateid->stateid.seqid)) >= 0)
+           (int)(lo->plh_barrier - be32_to_cpu(stateid->seqid)) >= 0)
                 return true;
         return lo->plh_block_lgets ||
                 test_bit(NFS_LAYOUT_DESTROYED, &lo->plh_flags) ||
@@ -549,11 +549,10 @@ pnfs_choose_layoutget_stateid(nfs4_stateid *dst, struct pnfs_layout_hdr *lo,
  
                 do {
                         seq = read_seqbegin(&open_state->seqlock);
-                       memcpy(dst->data, open_state->stateid.data,
-                              sizeof(open_state->stateid.data));
+                       nfs4_stateid_copy(dst, &open_state->stateid);
                 } while (read_seqretry(&open_state->seqlock, seq));
         } else
-               memcpy(dst->data, lo->plh_stateid.data, sizeof(lo->plh_stateid.data));
+               nfs4_stateid_copy(dst, &lo->plh_stateid);
         spin_unlock(&lo->plh_inode->i_lock);
         dprintk("<-- %s\n", __func__);
         return status;
@@ -590,7 +589,7 @@ send_layoutget(struct pnfs_layout_hdr *lo,
         max_resp_sz = server->nfs_client->cl_session->fc_attrs.max_resp_sz;
         max_pages = max_resp_sz >> PAGE_SHIFT;
  
-       pages = kzalloc(max_pages * sizeof(struct page *), gfp_flags);
+       pages = kcalloc(max_pages, sizeof(struct page *), gfp_flags);
         if (!pages)
                 goto out_err_free;
  
@@ -760,7 +759,7 @@ bool pnfs_roc_drain(struct inode *ino, u32 *barrier)
                 }
         if (!found) {
                 struct pnfs_layout_hdr *lo = nfsi->layout;
-               u32 current_seqid = be32_to_cpu(lo->plh_stateid.stateid.seqid);
+               u32 current_seqid = be32_to_cpu(lo->plh_stateid.seqid);
  
                 /* Since close does not return a layout stateid for use as
                  * a barrier, we choose the worst-case barrier.
@@ -966,8 +965,7 @@ pnfs_update_layout(struct inode *ino,
         }
  
         /* Do we even need to bother with this? */
-       if (test_bit(NFS4CLNT_LAYOUTRECALL, &clp->cl_state) ||
-           test_bit(NFS_LAYOUT_BULK_RECALL, &lo->plh_flags)) {
+       if (test_bit(NFS_LAYOUT_BULK_RECALL, &lo->plh_flags)) {
                 dprintk("%s matches recall, use MDS\n", __func__);
                 goto out_unlock;
         }
@@ -1032,7 +1030,6 @@ pnfs_layout_process(struct nfs4_layoutget *lgp)
         struct nfs4_layoutget_res *res = &lgp->res;
         struct pnfs_layout_segment *lseg;
         struct inode *ino = lo->plh_inode;
-       struct nfs_client *clp = NFS_SERVER(ino)->nfs_client;
         int status = 0;
  
         /* Inject layout blob into I/O device driver */
@@ -1048,8 +1045,7 @@ pnfs_layout_process(struct nfs4_layoutget *lgp)
         }
  
         spin_lock(&ino->i_lock);
-       if (test_bit(NFS4CLNT_LAYOUTRECALL, &clp->cl_state) ||
-           test_bit(NFS_LAYOUT_BULK_RECALL, &lo->plh_flags)) {
+       if (test_bit(NFS_LAYOUT_BULK_RECALL, &lo->plh_flags)) {
                 dprintk("%s forget reply due to recall\n", __func__);
                 goto out_forget_reply;
         }
@@ -1214,6 +1210,7 @@ void pnfs_ld_write_done(struct nfs_write_data *data)
                 }
                 data->task.tk_status = pnfs_write_done_resend_to_mds(data->inode, &data->pages);
         }
+       put_lseg(data->lseg);
         data->mds_ops->rpc_release(data);
  }
  EXPORT_SYMBOL_GPL(pnfs_ld_write_done);
@@ -1227,6 +1224,7 @@ pnfs_write_through_mds(struct nfs_pageio_descriptor *desc,
                 nfs_list_add_request(data->req, &desc->pg_list);
         nfs_pageio_reset_write_mds(desc);
         desc->pg_recoalesce = 1;
+       put_lseg(data->lseg);
         nfs_writedata_release(data);
  }
  
@@ -1327,6 +1325,7 @@ void pnfs_ld_read_done(struct nfs_read_data *data)
                 data->mds_ops->rpc_call_done(&data->task, data);
         } else
                 pnfs_ld_handle_read_error(data);
+       put_lseg(data->lseg);
         data->mds_ops->rpc_release(data);
  }
  EXPORT_SYMBOL_GPL(pnfs_ld_read_done);
@@ -1530,8 +1529,7 @@ pnfs_layoutcommit_inode(struct inode *inode, bool sync)
         end_pos = nfsi->layout->plh_lwb;
         nfsi->layout->plh_lwb = 0;
  
-       memcpy(&data->args.stateid.data, nfsi->layout->plh_stateid.data,
-               sizeof(nfsi->layout->plh_stateid.data));
+       nfs4_stateid_copy(&data->args.stateid, &nfsi->layout->plh_stateid);
         spin_unlock(&inode->i_lock);
  
         data->args.inode = inode;
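Note on pnfs_set_layout_stateid() above: the test `(int)(newseq - oldseq) > 0` compares two 32-bit sequence numbers in a way that stays correct when the counter wraps. The following is a minimal sketch of that comparison. The function name is hypothetical, and the cast assumes the usual two's-complement conversion, which the C standard leaves implementation-defined.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Return true when 'newseq' is ahead of 'oldseq' in modulo-2^32 order.
 * The unsigned subtraction wraps; reinterpreting the difference as a
 * signed value makes "slightly ahead" positive and "slightly behind"
 * negative, even across the 0xffffffff -> 0 boundary.
 */
static bool sketch_seqid_is_newer(uint32_t newseq, uint32_t oldseq)
{
	return (int32_t)(newseq - oldseq) > 0;
}
```

A plain `newseq > oldseq` would treat a seqid of 1 as older than 0xffffffff, so an update that arrives just after the wrap would be ignored. The comparison is only meaningful while the two values are less than 2^31 apart.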
diff --git a/fs/nfs/pnfs.h b/fs/nfs/pnfs.h
index 53d593a0a4f265a69c9f4fbc5d2ccb759291686a..442ebf68eeecf51dfaa6b8835318b53010eefe19 100644
--- a/fs/nfs/pnfs.h
+++ b/fs/nfs/pnfs.h
@@ -94,11 +94,10 @@ struct pnfs_layoutdriver_type {
         const struct nfs_pageio_ops *pg_read_ops;
         const struct nfs_pageio_ops *pg_write_ops;
  
-       /* Returns true if layoutdriver wants to divert this request to
-        * driver's commit routine.
-        */
-       bool (*mark_pnfs_commit)(struct pnfs_layout_segment *lseg);
-       struct list_head * (*choose_commit_list) (struct nfs_page *req);
+       void (*mark_request_commit) (struct nfs_page *req,
+                                       struct pnfs_layout_segment *lseg);
+       void (*clear_request_commit) (struct nfs_page *req);
+       int (*scan_commit_lists) (struct inode *inode, int max, spinlock_t *lock);
         int (*commit_pagelist)(struct inode *inode, struct list_head *mds_pages, int how);
  
         /*
@@ -229,7 +228,6 @@ struct nfs4_deviceid_node {
         atomic_t                        ref;
  };
  
-void nfs4_print_deviceid(const struct nfs4_deviceid *dev_id);
  struct nfs4_deviceid_node *nfs4_find_get_deviceid(const struct pnfs_layoutdriver_type *, const struct nfs_client *, const struct nfs4_deviceid *);
  void nfs4_delete_deviceid(const struct pnfs_layoutdriver_type *, const struct nfs_client *, const struct nfs4_deviceid *);
  void nfs4_init_deviceid_node(struct nfs4_deviceid_node *,
@@ -262,20 +260,6 @@ static inline int pnfs_enabled_sb(struct nfs_server *nfss)
         return nfss->pnfs_curr_ld != NULL;
  }
  
-static inline void
-pnfs_mark_request_commit(struct nfs_page *req, struct pnfs_layout_segment *lseg)
-{
-       if (lseg) {
-               struct pnfs_layoutdriver_type *ld;
-
-               ld = NFS_SERVER(req->wb_page->mapping->host)->pnfs_curr_ld;
-               if (ld->mark_pnfs_commit && ld->mark_pnfs_commit(lseg)) {
-                       set_bit(PG_PNFS_COMMIT, &req->wb_flags);
-                       req->wb_commit_lseg = get_lseg(lseg);
-               }
-       }
-}
-
  static inline int
  pnfs_commit_list(struct inode *inode, struct list_head *mds_pages, int how)
  {
@@ -284,27 +268,42 @@ pnfs_commit_list(struct inode *inode, struct list_head *mds_pages, int how)
         return NFS_SERVER(inode)->pnfs_curr_ld->commit_pagelist(inode, mds_pages, how);
  }
  
-static inline struct list_head *
-pnfs_choose_commit_list(struct nfs_page *req, struct list_head *mds)
+static inline bool
+pnfs_mark_request_commit(struct nfs_page *req, struct pnfs_layout_segment *lseg)
  {
-       struct list_head *rv;
+       struct inode *inode = req->wb_context->dentry->d_inode;
+       struct pnfs_layoutdriver_type *ld = NFS_SERVER(inode)->pnfs_curr_ld;
  
-       if (test_and_clear_bit(PG_PNFS_COMMIT, &req->wb_flags)) {
-               struct inode *inode = req->wb_commit_lseg->pls_layout->plh_inode;
+       if (lseg == NULL || ld->mark_request_commit == NULL)
+               return false;
+       ld->mark_request_commit(req, lseg);
+       return true;
+}
  
-               set_bit(NFS_INO_PNFS_COMMIT, &NFS_I(inode)->flags);
-               rv = NFS_SERVER(inode)->pnfs_curr_ld->choose_commit_list(req);
-               /* matched by ref taken when PG_PNFS_COMMIT is set */
-               put_lseg(req->wb_commit_lseg);
-       } else
-               rv = mds;
-       return rv;
+static inline bool
+pnfs_clear_request_commit(struct nfs_page *req)
+{
+       struct inode *inode = req->wb_context->dentry->d_inode;
+       struct pnfs_layoutdriver_type *ld = NFS_SERVER(inode)->pnfs_curr_ld;
+
+       if (ld == NULL || ld->clear_request_commit == NULL)
+               return false;
+       ld->clear_request_commit(req);
+       return true;
  }
  
-static inline void pnfs_clear_request_commit(struct nfs_page *req)
+static inline int
+pnfs_scan_commit_lists(struct inode *inode, int max, spinlock_t *lock)
  {
-       if (test_and_clear_bit(PG_PNFS_COMMIT, &req->wb_flags))
-               put_lseg(req->wb_commit_lseg);
+       struct pnfs_layoutdriver_type *ld = NFS_SERVER(inode)->pnfs_curr_ld;
+       int ret;
+
+       if (ld == NULL || ld->scan_commit_lists == NULL)
+               return 0;
+       ret = ld->scan_commit_lists(inode, max, lock);
+       if (ret != 0)
+               set_bit(NFS_INO_PNFS_COMMIT, &NFS_I(inode)->flags);
+       return ret;
  }
  
  /* Should the pNFS client commit and return the layout upon a setattr */
@@ -328,6 +327,13 @@ static inline int pnfs_return_layout(struct inode *ino)
         return 0;
  }
  
+#ifdef NFS_DEBUG
+void nfs4_print_deviceid(const struct nfs4_deviceid *dev_id);
+#else
+static inline void nfs4_print_deviceid(const struct nfs4_deviceid *dev_id)
+{
+}
+#endif /* NFS_DEBUG */
  #else  /* CONFIG_NFS_V4_1 */
  
  static inline void pnfs_destroy_all_layouts(struct nfs_client *clp)
@@ -400,35 +406,35 @@ static inline bool pnfs_pageio_init_write(struct nfs_pageio_descriptor *pgio, st
         return false;
  }
  
-static inline void
-pnfs_mark_request_commit(struct nfs_page *req, struct pnfs_layout_segment *lseg)
-{
-}
-
  static inline int
  pnfs_commit_list(struct inode *inode, struct list_head *mds_pages, int how)
  {
         return PNFS_NOT_ATTEMPTED;
  }
  
-static inline struct list_head *
-pnfs_choose_commit_list(struct nfs_page *req, struct list_head *mds)
+static inline bool
+pnfs_mark_request_commit(struct nfs_page *req, struct pnfs_layout_segment *lseg)
  {
-       return mds;
+       return false;
  }
  
-static inline void pnfs_clear_request_commit(struct nfs_page *req)
+static inline bool
+pnfs_clear_request_commit(struct nfs_page *req)
  {
+       return false;
  }
  
-static inline int pnfs_layoutcommit_inode(struct inode *inode, bool sync)
+static inline int
+pnfs_scan_commit_lists(struct inode *inode, int max, spinlock_t *lock)
  {
         return 0;
  }
  
-static inline void nfs4_deviceid_purge_client(struct nfs_client *ncl)
+static inline int pnfs_layoutcommit_inode(struct inode *inode, bool sync)
  {
+       return 0;
  }
+
  #endif /* CONFIG_NFS_V4_1 */
  
  #endif /* FS_NFS_PNFS_H */
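Note on the pnfs.h hunks above: the pNFS commit helpers now return bool so that a caller can tell whether the layout driver handled the request or whether it has to fall back to the generic per-inode commit list. The following is a minimal userspace sketch of that "optional hook with fallback" pattern. The struct and function names (`request`, `layout_driver`, `example_driver_clear`, and so on) are hypothetical stand-ins and are not kernel types.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct nfs_page: tracks which list holds it. */
struct request {
	int on_generic_list;
	int on_driver_list;
};

/* Hypothetical stand-in for struct pnfs_layoutdriver_type. */
struct layout_driver {
	void (*clear_request_commit)(struct request *req);
};

/* Example driver hook: removes the request from the driver's own list. */
static void example_driver_clear(struct request *req)
{
	req->on_driver_list = 0;
}

/* Returns true only if a driver hook exists and was called. */
static bool driver_clear_request_commit(const struct layout_driver *ld,
					struct request *req)
{
	if (ld == NULL || ld->clear_request_commit == NULL)
		return false;
	ld->clear_request_commit(req);
	return true;
}

/* Generic path: used only when the driver did not take the request. */
static void clear_request_commit(const struct layout_driver *ld,
				 struct request *req)
{
	assert(req != NULL);
	if (!driver_clear_request_commit(ld, req))
		req->on_generic_list = 0;
}
```

With a driver that supplies the hook, only the driver list is touched. With no driver, or a driver without the hook, the generic list is updated instead. This mirrors the shape of the stubs in the !CONFIG_NFS_V4_1 branch, which always return false.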
diff --git a/fs/nfs/pnfs_dev.c b/fs/nfs/pnfs_dev.c
index 4f359d2a26ebe3ce4a2160758c2a5e6c69163ba6..73f701f1f4d3325e2c54efb68e14b1df40eb90e1 100644
--- a/fs/nfs/pnfs_dev.c
+++ b/fs/nfs/pnfs_dev.c
@@ -43,6 +43,7 @@
  static struct hlist_head nfs4_deviceid_cache[NFS4_DEVICE_ID_HASH_SIZE];
  static DEFINE_SPINLOCK(nfs4_deviceid_lock);
  
+#ifdef NFS_DEBUG
  void
  nfs4_print_deviceid(const struct nfs4_deviceid *id)
  {
@@ -52,6 +53,7 @@ nfs4_print_deviceid(const struct nfs4_deviceid *id)
                 p[0], p[1], p[2], p[3]);
  }
  EXPORT_SYMBOL_GPL(nfs4_print_deviceid);
+#endif
  
  static inline u32
  nfs4_deviceid_hash(const struct nfs4_deviceid *id)
@@ -92,7 +94,7 @@ _lookup_deviceid(const struct pnfs_layoutdriver_type *ld,
   * @clp nfs_client associated with deviceid
   * @id deviceid to look up
   */
-struct nfs4_deviceid_node *
+static struct nfs4_deviceid_node *
  _find_get_deviceid(const struct pnfs_layoutdriver_type *ld,
                    const struct nfs_client *clp, const struct nfs4_deviceid *id,
                    long hash)
diff --git a/fs/nfs/proc.c b/fs/nfs/proc.c
index 0c672588fe5a71217ac83df8e1a11701934c5f9c..b63b6f4d14fbd5f54bdf265461c2a0069cbc5db0 100644
--- a/fs/nfs/proc.c
+++ b/fs/nfs/proc.c
@@ -358,6 +358,11 @@ nfs_proc_unlink_setup(struct rpc_message *msg, struct inode *dir)
         msg->rpc_proc = &nfs_procedures[NFSPROC_REMOVE];
  }
  
+static void nfs_proc_unlink_rpc_prepare(struct rpc_task *task, struct nfs_unlinkdata *data)
+{
+       rpc_call_start(task);
+}
+
  static int nfs_proc_unlink_done(struct rpc_task *task, struct inode *dir)
  {
         if (nfs_async_handle_expired_key(task))
@@ -372,6 +377,11 @@ nfs_proc_rename_setup(struct rpc_message *msg, struct inode *dir)
         msg->rpc_proc = &nfs_procedures[NFSPROC_RENAME];
  }
  
+static void nfs_proc_rename_rpc_prepare(struct rpc_task *task, struct nfs_renamedata *data)
+{
+       rpc_call_start(task);
+}
+
  static int
  nfs_proc_rename_done(struct rpc_task *task, struct inode *old_dir,
                      struct inode *new_dir)
@@ -651,6 +661,11 @@ static void nfs_proc_read_setup(struct nfs_read_data *data, struct rpc_message *
         msg->rpc_proc = &nfs_procedures[NFSPROC_READ];
  }
  
+static void nfs_proc_read_rpc_prepare(struct rpc_task *task, struct nfs_read_data *data)
+{
+       rpc_call_start(task);
+}
+
  static int nfs_write_done(struct rpc_task *task, struct nfs_write_data *data)
  {
         if (nfs_async_handle_expired_key(task))
@@ -668,6 +683,11 @@ static void nfs_proc_write_setup(struct nfs_write_data *data, struct rpc_message
         msg->rpc_proc = &nfs_procedures[NFSPROC_WRITE];
  }
  
+static void nfs_proc_write_rpc_prepare(struct rpc_task *task, struct nfs_write_data *data)
+{
+       rpc_call_start(task);
+}
+
  static void
  nfs_proc_commit_setup(struct nfs_write_data *data, struct rpc_message *msg)
  {
@@ -721,9 +741,11 @@ const struct nfs_rpc_ops nfs_v2_clientops = {
         .create         = nfs_proc_create,
         .remove         = nfs_proc_remove,
         .unlink_setup   = nfs_proc_unlink_setup,
+       .unlink_rpc_prepare = nfs_proc_unlink_rpc_prepare,
         .unlink_done    = nfs_proc_unlink_done,
         .rename         = nfs_proc_rename,
         .rename_setup   = nfs_proc_rename_setup,
+       .rename_rpc_prepare = nfs_proc_rename_rpc_prepare,
         .rename_done    = nfs_proc_rename_done,
         .link           = nfs_proc_link,
         .symlink        = nfs_proc_symlink,
@@ -736,8 +758,10 @@ const struct nfs_rpc_ops nfs_v2_clientops = {
         .pathconf       = nfs_proc_pathconf,
         .decode_dirent  = nfs2_decode_dirent,
         .read_setup     = nfs_proc_read_setup,
+       .read_rpc_prepare = nfs_proc_read_rpc_prepare,
         .read_done      = nfs_read_done,
         .write_setup    = nfs_proc_write_setup,
+       .write_rpc_prepare = nfs_proc_write_rpc_prepare,
         .write_done     = nfs_write_done,
         .commit_setup   = nfs_proc_commit_setup,
         .lock           = nfs_proc_lock,
diff --git a/fs/nfs/read.c b/fs/nfs/read.c
index cfa175c223dcfa5b79ebf17b3d609649fbd7188d..cc1f758a7ee1a0234d491da9de8572eb1b45a44f 100644
--- a/fs/nfs/read.c
+++ b/fs/nfs/read.c
@@ -66,7 +66,6 @@ void nfs_readdata_free(struct nfs_read_data *p)
  
  void nfs_readdata_release(struct nfs_read_data *rdata)
  {
-       put_lseg(rdata->lseg);
         put_nfs_open_context(rdata->args.context);
         nfs_readdata_free(rdata);
  }
@@ -465,23 +464,14 @@ static void nfs_readpage_release_partial(void *calldata)
         nfs_readdata_release(calldata);
  }
  
-#if defined(CONFIG_NFS_V4_1)
  void nfs_read_prepare(struct rpc_task *task, void *calldata)
  {
         struct nfs_read_data *data = calldata;
-
-       if (nfs4_setup_sequence(NFS_SERVER(data->inode),
-                               &data->args.seq_args, &data->res.seq_res,
-                               0, task))
-               return;
-       rpc_call_start(task);
+       NFS_PROTO(data->inode)->read_rpc_prepare(task, data);
  }
-#endif /* CONFIG_NFS_V4_1 */
  
  static const struct rpc_call_ops nfs_read_partial_ops = {
-#if defined(CONFIG_NFS_V4_1)
         .rpc_call_prepare = nfs_read_prepare,
-#endif /* CONFIG_NFS_V4_1 */
         .rpc_call_done = nfs_readpage_result_partial,
         .rpc_release = nfs_readpage_release_partial,
  };
@@ -545,9 +535,7 @@ static void nfs_readpage_release_full(void *calldata)
  }
  
  static const struct rpc_call_ops nfs_read_full_ops = {
-#if defined(CONFIG_NFS_V4_1)
         .rpc_call_prepare = nfs_read_prepare,
-#endif /* CONFIG_NFS_V4_1 */
         .rpc_call_done = nfs_readpage_result_full,
         .rpc_release = nfs_readpage_release_full,
  };
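Note on the proc.c and read.c hunks above: the `#if defined(CONFIG_NFS_V4_1)` blocks around `nfs_read_prepare` are replaced by a call through the per-protocol ops table, so every protocol version supplies its own `read_rpc_prepare` hook. The sketch below illustrates that dispatch in userspace. All names are hypothetical. The "v4.1" hook is an assumption modelled on the removed lines (set up the session sequence, then start the call), since the NFSv4 side of the change is not part of this section.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct rpc_task. */
struct task {
	int sequenced;	/* set when a session slot was set up first */
	int started;	/* set when the call was started */
};

/* Hypothetical stand-in for struct nfs_rpc_ops. */
struct proto_ops {
	unsigned int version;
	void (*read_rpc_prepare)(struct task *task);
};

/* v2-style hook: nothing to set up, just start the call. */
static void v2_read_rpc_prepare(struct task *task)
{
	task->started = 1;
}

/* v4.1-style hook (assumed): sequence setup precedes the start. */
static void v41_read_rpc_prepare(struct task *task)
{
	task->sequenced = 1;
	task->started = 1;
}

static const struct proto_ops v2_ops = { 2, v2_read_rpc_prepare };
static const struct proto_ops v41_ops = { 4, v41_read_rpc_prepare };

/* Generic prepare callback: no #ifdef, only an indirect call. */
static void read_prepare(const struct proto_ops *ops, struct task *task)
{
	assert(ops != NULL && ops->read_rpc_prepare != NULL);
	ops->read_rpc_prepare(task);
}
```

The design choice is that the generic read path no longer needs to know which versions are compiled in. The cost is that every ops table must fill in the hook, which is why proc.c gains four trivial `*_rpc_prepare` functions.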
diff --git a/fs/nfs/super.c b/fs/nfs/super.c
index 3dfa4f112c0ab8be8d5b3f897173a68502406b6c..ccc4cdb1efe9a9e7842718ef24407557b73b382d 100644
--- a/fs/nfs/super.c
+++ b/fs/nfs/super.c
@@ -52,6 +52,8 @@
  #include <linux/nfs_xdr.h>
  #include <linux/magic.h>
  #include <linux/parser.h>
+#include <linux/nsproxy.h>
+#include <linux/rcupdate.h>
  
  #include <asm/system.h>
  #include <asm/uaccess.h>
@@ -79,7 +81,6 @@ enum {
         Opt_cto, Opt_nocto,
         Opt_ac, Opt_noac,
         Opt_lock, Opt_nolock,
-       Opt_v2, Opt_v3, Opt_v4,
         Opt_udp, Opt_tcp, Opt_rdma,
         Opt_acl, Opt_noacl,
         Opt_rdirplus, Opt_nordirplus,
@@ -97,10 +98,10 @@ enum {
         Opt_namelen,
         Opt_mountport,
         Opt_mountvers,
-       Opt_nfsvers,
         Opt_minorversion,
  
         /* Mount options that take string arguments */
+       Opt_nfsvers,
         Opt_sec, Opt_proto, Opt_mountproto, Opt_mounthost,
         Opt_addr, Opt_mountaddr, Opt_clientaddr,
         Opt_lookupcache,
@@ -132,9 +133,6 @@ static const match_table_t nfs_mount_option_tokens = {
         { Opt_noac, "noac" },
         { Opt_lock, "lock" },
         { Opt_nolock, "nolock" },
-       { Opt_v2, "v2" },
-       { Opt_v3, "v3" },
-       { Opt_v4, "v4" },
         { Opt_udp, "udp" },
         { Opt_tcp, "tcp" },
         { Opt_rdma, "rdma" },
@@ -163,9 +161,10 @@ static const match_table_t nfs_mount_option_tokens = {
         { Opt_namelen, "namlen=%s" },
         { Opt_mountport, "mountport=%s" },
         { Opt_mountvers, "mountvers=%s" },
+       { Opt_minorversion, "minorversion=%s" },
+
         { Opt_nfsvers, "nfsvers=%s" },
         { Opt_nfsvers, "vers=%s" },
-       { Opt_minorversion, "minorversion=%s" },
  
         { Opt_sec, "sec=%s" },
         { Opt_proto, "proto=%s" },
@@ -179,6 +178,9 @@ static const match_table_t nfs_mount_option_tokens = {
         { Opt_fscache_uniq, "fsc=%s" },
         { Opt_local_lock, "local_lock=%s" },
  
+       /* The following needs to be listed after all other options */
+       { Opt_nfsvers, "v%s" },
+
         { Opt_err, NULL }
  };
  
@@ -259,6 +261,22 @@ static match_table_t nfs_local_lock_tokens = {
         { Opt_local_lock_err, NULL }
  };
  
+enum {
+       Opt_vers_2, Opt_vers_3, Opt_vers_4, Opt_vers_4_0,
+       Opt_vers_4_1,
+
+       Opt_vers_err
+};
+
+static match_table_t nfs_vers_tokens = {
+       { Opt_vers_2, "2" },
+       { Opt_vers_3, "3" },
+       { Opt_vers_4, "4" },
+       { Opt_vers_4_0, "4.0" },
+       { Opt_vers_4_1, "4.1" },
+
+       { Opt_vers_err, NULL }
+};
  
  static void nfs_umount_begin(struct super_block *);
  static int  nfs_statfs(struct dentry *, struct kstatfs *);
@@ -620,7 +638,6 @@ static void nfs_show_nfsv4_options(struct seq_file *m, struct nfs_server *nfss,
         struct nfs_client *clp = nfss->nfs_client;
  
         seq_printf(m, ",clientaddr=%s", clp->cl_ipaddr);
-       seq_printf(m, ",minorversion=%u", clp->cl_minorversion);
  }
  #else
  static void nfs_show_nfsv4_options(struct seq_file *m, struct nfs_server *nfss,
@@ -629,6 +646,15 @@ static void nfs_show_nfsv4_options(struct seq_file *m, struct nfs_server *nfss,
  }
  #endif
  
+static void nfs_show_nfs_version(struct seq_file *m,
+               unsigned int version,
+               unsigned int minorversion)
+{
+       seq_printf(m, ",vers=%u", version);
+       if (version == 4)
+               seq_printf(m, ".%u", minorversion);
+}
+
  /*
   * Describe the mount options in force on this server representation
   */
@@ -656,7 +682,7 @@ static void nfs_show_mount_options(struct seq_file *m, struct nfs_server *nfss,
         u32 version = clp->rpc_ops->version;
         int local_flock, local_fcntl;
  
-       seq_printf(m, ",vers=%u", version);
+       nfs_show_nfs_version(m, version, clp->cl_minorversion);
         seq_printf(m, ",rsize=%u", nfss->rsize);
         seq_printf(m, ",wsize=%u", nfss->wsize);
         if (nfss->bsize != 0)
@@ -676,8 +702,10 @@ static void nfs_show_mount_options(struct seq_file *m, struct nfs_server *nfss,
                 else
                         seq_puts(m, nfs_infop->nostr);
         }
+       rcu_read_lock();
         seq_printf(m, ",proto=%s",
                    rpc_peeraddr2str(nfss->client, RPC_DISPLAY_NETID));
+       rcu_read_unlock();
         if (version == 4) {
                 if (nfss->port != NFS_PORT)
                         seq_printf(m, ",port=%u", nfss->port);
@@ -726,9 +754,11 @@ static int nfs_show_options(struct seq_file *m, struct dentry *root)
  
         nfs_show_mount_options(m, nfss, 0);
  
+       rcu_read_lock();
         seq_printf(m, ",addr=%s",
                         rpc_peeraddr2str(nfss->nfs_client->cl_rpcclient,
                                                         RPC_DISPLAY_ADDR));
+       rcu_read_unlock();
  
         return 0;
  }
@@ -745,7 +775,6 @@ static void show_sessions(struct seq_file *m, struct nfs_server *server) {}
  #endif
  #endif
  
-#ifdef CONFIG_NFS_V4
  #ifdef CONFIG_NFS_V4_1
  static void show_pnfs(struct seq_file *m, struct nfs_server *server)
  {
@@ -755,9 +784,26 @@ static void show_pnfs(struct seq_file *m, struct nfs_server *server)
         else
                 seq_printf(m, "not configured");
  }
+
+static void show_implementation_id(struct seq_file *m, struct nfs_server *nfss)
+{
+       if (nfss->nfs_client && nfss->nfs_client->impl_id) {
+               struct nfs41_impl_id *impl_id = nfss->nfs_client->impl_id;
+               seq_printf(m, "\n\timpl_id:\tname='%s',domain='%s',"
+                          "date='%llu,%u'",
+                          impl_id->name, impl_id->domain,
+                          impl_id->date.seconds, impl_id->date.nseconds);
+       }
+}
  #else
-static void show_pnfs(struct seq_file *m, struct nfs_server *server) {}
+#ifdef CONFIG_NFS_V4
+static void show_pnfs(struct seq_file *m, struct nfs_server *server)
+{
+}
  #endif
+static void show_implementation_id(struct seq_file *m, struct nfs_server *nfss)
+{
+}
  #endif
  
  static int nfs_show_devname(struct seq_file *m, struct dentry *root)
@@ -806,6 +852,8 @@ static int nfs_show_stats(struct seq_file *m, struct dentry *root)
  
         seq_printf(m, "\n\tage:\t%lu", (jiffies - nfss->mount_time) / HZ);
  
+       show_implementation_id(m, nfss);
+
         seq_printf(m, "\n\tcaps:\t");
         seq_printf(m, "caps=0x%x", nfss->caps);
         seq_printf(m, ",wtmult=%u", nfss->wtmult);
@@ -908,6 +956,7 @@ static struct nfs_parsed_mount_data *nfs_alloc_parsed_mount_data(unsigned int ve
                 data->auth_flavor_len   = 1;
                 data->version           = version;
                 data->minorversion      = 0;
+               data->net               = current->nsproxy->net_ns;
                 security_init_mnt_opts(&data->lsm_opts);
         }
         return data;
@@ -1052,6 +1101,40 @@ static int nfs_parse_security_flavors(char *value,
         return 1;
  }
  
+static int nfs_parse_version_string(char *string,
+               struct nfs_parsed_mount_data *mnt,
+               substring_t *args)
+{
+       mnt->flags &= ~NFS_MOUNT_VER3;
+       switch (match_token(string, nfs_vers_tokens, args)) {
+       case Opt_vers_2:
+               mnt->version = 2;
+               break;
+       case Opt_vers_3:
+               mnt->flags |= NFS_MOUNT_VER3;
+               mnt->version = 3;
+               break;
+       case Opt_vers_4:
+               /* Backward compatibility option. In future,
+                * the mount program should always supply
+                * a NFSv4 minor version number.
+                */
+               mnt->version = 4;
+               break;
+       case Opt_vers_4_0:
+               mnt->version = 4;
+               mnt->minorversion = 0;
+               break;
+       case Opt_vers_4_1:
+               mnt->version = 4;
+               mnt->minorversion = 1;
+               break;
+       default:
+               return 0;
+       }
+       return 1;
+}
+
  static int nfs_get_option_str(substring_t args[], char **option)
  {
         kfree(*option);
@@ -1157,18 +1240,6 @@ static int nfs_parse_mount_options(char *raw,
                         mnt->flags |= (NFS_MOUNT_LOCAL_FLOCK |
                                        NFS_MOUNT_LOCAL_FCNTL);
                         break;
-               case Opt_v2:
-                       mnt->flags &= ~NFS_MOUNT_VER3;
-                       mnt->version = 2;
-                       break;
-               case Opt_v3:
-                       mnt->flags |= NFS_MOUNT_VER3;
-                       mnt->version = 3;
-                       break;
-               case Opt_v4:
-                       mnt->flags &= ~NFS_MOUNT_VER3;
-                       mnt->version = 4;
-                       break;
                 case Opt_udp:
                         mnt->flags &= ~NFS_MOUNT_TCP;
                         mnt->nfs_server.protocol = XPRT_TRANSPORT_UDP;
@@ -1295,26 +1366,6 @@ static int nfs_parse_mount_options(char *raw,
                                 goto out_invalid_value;
                         mnt->mount_server.version = option;
                         break;
-               case Opt_nfsvers:
-                       if (nfs_get_option_ul(args, &option))
-                               goto out_invalid_value;
-                       switch (option) {
-                       case NFS2_VERSION:
-                               mnt->flags &= ~NFS_MOUNT_VER3;
-                               mnt->version = 2;
-                               break;
-                       case NFS3_VERSION:
-                               mnt->flags |= NFS_MOUNT_VER3;
-                               mnt->version = 3;
-                               break;
-                       case NFS4_VERSION:
-                               mnt->flags &= ~NFS_MOUNT_VER3;
-                               mnt->version = 4;
-                               break;
-                       default:
-                               goto out_invalid_value;
-                       }
-                       break;
                 case Opt_minorversion:
                         if (nfs_get_option_ul(args, &option))
                                 goto out_invalid_value;
@@ -1326,6 +1377,15 @@ static int nfs_parse_mount_options(char *raw,
                 /*
                  * options that take text values
                  */
+               case Opt_nfsvers:
+                       string = match_strdup(args);
+                       if (string == NULL)
+                               goto out_nomem;
+                       rc = nfs_parse_version_string(string, mnt, args);
+                       kfree(string);
+                       if (!rc)
+                               goto out_invalid_value;
+                       break;
                 case Opt_sec:
                         string = match_strdup(args);
                         if (string == NULL)
@@ -1405,7 +1465,7 @@ static int nfs_parse_mount_options(char *raw,
                         if (string == NULL)
                                 goto out_nomem;
                         mnt->nfs_server.addrlen =
-                               rpc_pton(string, strlen(string),
+                               rpc_pton(mnt->net, string, strlen(string),
                                         (struct sockaddr *)
                                         &mnt->nfs_server.address,
                                         sizeof(mnt->nfs_server.address));
@@ -1427,7 +1487,7 @@ static int nfs_parse_mount_options(char *raw,
                         if (string == NULL)
                                 goto out_nomem;
                         mnt->mount_server.addrlen =
-                               rpc_pton(string, strlen(string),
+                               rpc_pton(mnt->net, string, strlen(string),
                                         (struct sockaddr *)
                                         &mnt->mount_server.address,
                                         sizeof(mnt->mount_server.address));
@@ -1516,6 +1576,9 @@ static int nfs_parse_mount_options(char *raw,
         if (!sloppy && invalid_option)
                 return 0;
  
+       if (mnt->minorversion && mnt->version != 4)
+               goto out_minorversion_mismatch;
+
         /*
          * verify that any proto=/mountproto= options match the address
          * familiies in the addr=/mountaddr= options.
@@ -1549,6 +1612,10 @@ out_invalid_address:
  out_invalid_value:
         printk(KERN_INFO "NFS: bad mount option value specified: %s\n", p);
         return 0;
+out_minorversion_mismatch:
+       printk(KERN_INFO "NFS: mount option vers=%u does not support "
+                        "minorversion=%u\n", mnt->version, mnt->minorversion);
+       return 0;
  out_nomem:
         printk(KERN_INFO "NFS: not enough memory to parse option\n");
         return 0;
@@ -1622,6 +1689,7 @@ static int nfs_try_mount(struct nfs_parsed_mount_data *args,
                 .noresvport     = args->flags & NFS_MOUNT_NORESVPORT,
                 .auth_flav_len  = &server_authlist_len,
                 .auth_flavs     = server_authlist,
+               .net            = args->net,
         };
         int status;
  
@@ -2047,7 +2115,7 @@ static inline void nfs_initialise_sb(struct super_block *sb)
  
         /* We probably want something more informative here */
         snprintf(sb->s_id, sizeof(sb->s_id),
-                "%x:%x", MAJOR(sb->s_dev), MINOR(sb->s_dev));
+                "%u:%u", MAJOR(sb->s_dev), MINOR(sb->s_dev));
  
         if (sb->s_blocksize == 0)
                 sb->s_blocksize = nfs_block_bits(server->wsize,
@@ -2499,12 +2567,6 @@ static int nfs4_validate_text_mount_data(void *options,
                 return -EINVAL;
         }
  
-       if (args->client_address == NULL) {
-               dfprintk(MOUNT,
-                        "NFS4: mount program didn't pass callback address\n");
-               return -EINVAL;
-       }
-
         return nfs_parse_devname(dev_name,
                                    &args->nfs_server.hostname,
                                    NFS4_MAXNAMLEN,
@@ -2663,8 +2725,7 @@ nfs4_remote_mount(struct file_system_type *fs_type, int flags,
         if (!s->s_root) {
                 /* initial superblock/root creation */
                 nfs4_fill_super(s);
-               nfs_fscache_get_super_cookie(
-                       s, data ? data->fscache_uniq : NULL, NULL);
+               nfs_fscache_get_super_cookie(s, data->fscache_uniq, NULL);
         }
  
         mntroot = nfs4_get_root(s, mntfh, dev_name);
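Note on the super.c hunks above: `vers=`/`nfsvers=` change from a numeric option to a string option so that "4.0" and "4.1" can be accepted, and a final check rejects a non-zero minor version on anything other than NFSv4. The sketch below reproduces that logic in userspace with a plain lookup table in place of `match_token()`. The type and function names are hypothetical, and the `NFS_MOUNT_VER3` flag handling is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the version fields of nfs_parsed_mount_data. */
struct mount_version {
	unsigned int version;
	unsigned int minorversion;
};

struct vers_token {
	const char *text;
	unsigned int version;
	int sets_minor;		/* "4" alone leaves minorversion untouched */
	unsigned int minorversion;
};

static const struct vers_token vers_tokens[] = {
	{ "2",   2, 0, 0 },
	{ "3",   3, 0, 0 },
	{ "4",   4, 0, 0 },
	{ "4.0", 4, 1, 0 },
	{ "4.1", 4, 1, 1 },
};

/* Returns 1 on success and 0 for an unknown string, as in the patch. */
static int parse_version_string(const char *s, struct mount_version *mnt)
{
	size_t i;

	assert(s != NULL && mnt != NULL);
	for (i = 0; i < sizeof(vers_tokens) / sizeof(vers_tokens[0]); i++) {
		if (strcmp(s, vers_tokens[i].text) != 0)
			continue;
		mnt->version = vers_tokens[i].version;
		if (vers_tokens[i].sets_minor)
			mnt->minorversion = vers_tokens[i].minorversion;
		return 1;
	}
	return 0;
}

/* Post-parse check: a non-zero minor version requires version 4. */
static int version_is_consistent(const struct mount_version *mnt)
{
	return !(mnt->minorversion != 0 && mnt->version != 4);
}
```

For example, parsing "4.1" yields version 4 with minor version 1, while "minorversion=1" followed by "vers=3" would parse but then fail the consistency check, which corresponds to the new `out_minorversion_mismatch` path. The same parser also serves the `v%s` pattern, so "v3" and "v4.1" reach it as "3" and "4.1".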
diff --git a/fs/nfs/sysctl.c b/fs/nfs/sysctl.c
index 978aaeb8a0936617ed19cde6c5995e247b77af4c..ad4d2e787b2041d17eaacc4a1ce8097f0cc13aca 100644
--- a/fs/nfs/sysctl.c
+++ b/fs/nfs/sysctl.c
@@ -32,7 +32,6 @@ static ctl_table nfs_cb_sysctls[] = {
                 .extra1 = (int *)&nfs_set_port_min,
                 .extra2 = (int *)&nfs_set_port_max,
         },
-#ifndef CONFIG_NFS_USE_NEW_IDMAPPER
         {
                 .procname = "idmap_cache_timeout",
                 .data = &nfs_idmap_cache_timeout,
@@ -40,7 +39,6 @@ static ctl_table nfs_cb_sysctls[] = {
                 .mode = 0644,
                 .proc_handler = proc_dointvec_jiffies,
         },
-#endif /* CONFIG_NFS_USE_NEW_IDMAPPER */
  #endif
         {
                 .procname       = "nfs_mountpoint_timeout",
diff --git a/fs/nfs/unlink.c b/fs/nfs/unlink.c
index 4f9319a2e5674554d48a01af2189011a5e35346c..3210a03342f924e886e5c7f174c9147de54911a4 100644
--- a/fs/nfs/unlink.c
+++ b/fs/nfs/unlink.c
@@ -20,15 +20,6 @@
  #include "iostat.h"
  #include "delegation.h"
  
-struct nfs_unlinkdata {
-       struct hlist_node list;
-       struct nfs_removeargs args;
-       struct nfs_removeres res;
-       struct inode *dir;
-       struct rpc_cred *cred;
-       struct nfs_fattr dir_attr;
-};
-
  /**
   * nfs_free_unlinkdata - release data from a sillydelete operation.
   * @data: pointer to unlink structure.
@@ -107,25 +98,16 @@ static void nfs_async_unlink_release(void *calldata)
         nfs_sb_deactive(sb);
  }
  
-#if defined(CONFIG_NFS_V4_1)
-void nfs_unlink_prepare(struct rpc_task *task, void *calldata)
+static void nfs_unlink_prepare(struct rpc_task *task, void *calldata)
  {
         struct nfs_unlinkdata *data = calldata;
-       struct nfs_server *server = NFS_SERVER(data->dir);
-
-       if (nfs4_setup_sequence(server, &data->args.seq_args,
-                               &data->res.seq_res, 1, task))
-               return;
-       rpc_call_start(task);
+       NFS_PROTO(data->dir)->unlink_rpc_prepare(task, data);
  }
-#endif /* CONFIG_NFS_V4_1 */
  
  static const struct rpc_call_ops nfs_unlink_ops = {
         .rpc_call_done = nfs_async_unlink_done,
         .rpc_release = nfs_async_unlink_release,
-#if defined(CONFIG_NFS_V4_1)
         .rpc_call_prepare = nfs_unlink_prepare,
-#endif /* CONFIG_NFS_V4_1 */
  };
  
  static int nfs_do_call_unlink(struct dentry *parent, struct inode *dir, struct nfs_unlinkdata *data)
@@ -341,18 +323,6 @@ nfs_cancel_async_unlink(struct dentry *dentry)
         spin_unlock(&dentry->d_lock);
  }
  
-struct nfs_renamedata {
-       struct nfs_renameargs   args;
-       struct nfs_renameres    res;
-       struct rpc_cred         *cred;
-       struct inode            *old_dir;
-       struct dentry           *old_dentry;
-       struct nfs_fattr        old_fattr;
-       struct inode            *new_dir;
-       struct dentry           *new_dentry;
-       struct nfs_fattr        new_fattr;
-};
-
  /**
   * nfs_async_rename_done - Sillyrename post-processing
   * @task: rpc_task of the sillyrename
@@ -403,25 +373,16 @@ static void nfs_async_rename_release(void *calldata)
         kfree(data);
  }
  
-#if defined(CONFIG_NFS_V4_1)
  static void nfs_rename_prepare(struct rpc_task *task, void *calldata)
  {
         struct nfs_renamedata *data = calldata;
-       struct nfs_server *server = NFS_SERVER(data->old_dir);
-
-       if (nfs4_setup_sequence(server, &data->args.seq_args,
-                               &data->res.seq_res, 1, task))
-               return;
-       rpc_call_start(task);
+       NFS_PROTO(data->old_dir)->rename_rpc_prepare(task, data);
  }
-#endif /* CONFIG_NFS_V4_1 */
  
  static const struct rpc_call_ops nfs_rename_ops = {
         .rpc_call_done = nfs_async_rename_done,
         .rpc_release = nfs_async_rename_release,
-#if defined(CONFIG_NFS_V4_1)
         .rpc_call_prepare = nfs_rename_prepare,
-#endif /* CONFIG_NFS_V4_1 */
  };
  
  /**
diff --git a/fs/nfs/write.c b/fs/nfs/write.c
index 834f0fe96f89f4acf707df504e1a244fc56d466b..2c68818f68ac056b8587c22bf40f8f5d229a6403 100644
--- a/fs/nfs/write.c
+++ b/fs/nfs/write.c
@@ -100,7 +100,6 @@ void nfs_writedata_free(struct nfs_write_data *p)
  
  void nfs_writedata_release(struct nfs_write_data *wdata)
  {
-       put_lseg(wdata->lseg);
         put_nfs_open_context(wdata->args.context);
         nfs_writedata_free(wdata);
  }
@@ -236,10 +235,10 @@ static struct nfs_page *nfs_find_and_lock_request(struct page *page, bool nonblo
                 req = nfs_page_find_request_locked(page);
                 if (req == NULL)
                         break;
-               if (nfs_set_page_tag_locked(req))
+               if (nfs_lock_request_dontget(req))
                         break;
                 /* Note: If we hold the page lock, as is the case in nfs_writepage,
-                *       then the call to nfs_set_page_tag_locked() will always
+                *       then the call to nfs_lock_request_dontget() will always
                  *       succeed provided that someone hasn't already marked the
                  *       request as dirty (in which case we don't care).
                  */
@@ -375,21 +374,14 @@ out_err:
  /*
   * Insert a write request into an inode
   */
-static int nfs_inode_add_request(struct inode *inode, struct nfs_page *req)
+static void nfs_inode_add_request(struct inode *inode, struct nfs_page *req)
  {
         struct nfs_inode *nfsi = NFS_I(inode);
-       int error;
-
-       error = radix_tree_preload(GFP_NOFS);
-       if (error != 0)
-               goto out;
  
         /* Lock the request! */
         nfs_lock_request_dontget(req);
  
         spin_lock(&inode->i_lock);
-       error = radix_tree_insert(&nfsi->nfs_page_tree, req->wb_index, req);
-       BUG_ON(error);
         if (!nfsi->npages && nfs_have_delegation(inode, FMODE_WRITE))
                 inode->i_version++;
         set_bit(PG_MAPPED, &req->wb_flags);
@@ -397,12 +389,7 @@ static int nfs_inode_add_request(struct inode *inode, struct nfs_page *req)
         set_page_private(req->wb_page, (unsigned long)req);
         nfsi->npages++;
         kref_get(&req->wb_kref);
-       radix_tree_tag_set(&nfsi->nfs_page_tree, req->wb_index,
-                               NFS_PAGE_TAG_LOCKED);
         spin_unlock(&inode->i_lock);
-       radix_tree_preload_end();
-out:
-       return error;
  }
  
  /*
@@ -419,7 +406,6 @@ static void nfs_inode_remove_request(struct nfs_page *req)
         set_page_private(req->wb_page, 0);
         ClearPagePrivate(req->wb_page);
         clear_bit(PG_MAPPED, &req->wb_flags);
-       radix_tree_delete(&nfsi->nfs_page_tree, req->wb_index);
         nfsi->npages--;
         spin_unlock(&inode->i_lock);
         nfs_release_request(req);
@@ -432,39 +418,90 @@ nfs_mark_request_dirty(struct nfs_page *req)
  }
  
  #if defined(CONFIG_NFS_V3) || defined(CONFIG_NFS_V4)
-/*
- * Add a request to the inode's commit list.
+/**
+ * nfs_request_add_commit_list - add request to a commit list
+ * @req: pointer to a struct nfs_page
+ * @head: commit list head
+ *
+ * This sets the PG_CLEAN bit, updates the inode global count of
+ * number of outstanding requests requiring a commit as well as
+ * the MM page stats.
+ *
+ * The caller must _not_ hold the inode->i_lock, but must be
+ * holding the nfs_page lock.
   */
-static void
-nfs_mark_request_commit(struct nfs_page *req, struct pnfs_layout_segment *lseg)
+void
+nfs_request_add_commit_list(struct nfs_page *req, struct list_head *head)
  {
         struct inode *inode = req->wb_context->dentry->d_inode;
-       struct nfs_inode *nfsi = NFS_I(inode);
  
-       spin_lock(&inode->i_lock);
         set_bit(PG_CLEAN, &(req)->wb_flags);
-       radix_tree_tag_set(&nfsi->nfs_page_tree,
-                       req->wb_index,
-                       NFS_PAGE_TAG_COMMIT);
-       nfsi->ncommit++;
+       spin_lock(&inode->i_lock);
+       nfs_list_add_request(req, head);
+       NFS_I(inode)->ncommit++;
         spin_unlock(&inode->i_lock);
-       pnfs_mark_request_commit(req, lseg);
         inc_zone_page_state(req->wb_page, NR_UNSTABLE_NFS);
         inc_bdi_stat(req->wb_page->mapping->backing_dev_info, BDI_RECLAIMABLE);
         __mark_inode_dirty(inode, I_DIRTY_DATASYNC);
  }
+EXPORT_SYMBOL_GPL(nfs_request_add_commit_list);
  
-static int
+/**
+ * nfs_request_remove_commit_list - Remove request from a commit list
+ * @req: pointer to a nfs_page
+ *
+ * This clears the PG_CLEAN bit, and updates the inode global count of
+ * number of outstanding requests requiring a commit
+ * It does not update the MM page stats.
+ *
+ * The caller _must_ hold the inode->i_lock and the nfs_page lock.
+ */
+void
+nfs_request_remove_commit_list(struct nfs_page *req)
+{
+       struct inode *inode = req->wb_context->dentry->d_inode;
+
+       if (!test_and_clear_bit(PG_CLEAN, &(req)->wb_flags))
+               return;
+       nfs_list_remove_request(req);
+       NFS_I(inode)->ncommit--;
+}
+EXPORT_SYMBOL_GPL(nfs_request_remove_commit_list);
+
+
+/*
+ * Add a request to the inode's commit list.
+ */
+static void
+nfs_mark_request_commit(struct nfs_page *req, struct pnfs_layout_segment *lseg)
+{
+       struct inode *inode = req->wb_context->dentry->d_inode;
+
+       if (pnfs_mark_request_commit(req, lseg))
+               return;
+       nfs_request_add_commit_list(req, &NFS_I(inode)->commit_list);
+}
+
+static void
+nfs_clear_page_commit(struct page *page)
+{
+       dec_zone_page_state(page, NR_UNSTABLE_NFS);
+       dec_bdi_stat(page->mapping->backing_dev_info, BDI_RECLAIMABLE);
+}
+
+static void
  nfs_clear_request_commit(struct nfs_page *req)
  {
-       struct page *page = req->wb_page;
+       if (test_bit(PG_CLEAN, &req->wb_flags)) {
+               struct inode *inode = req->wb_context->dentry->d_inode;
  
-       if (test_and_clear_bit(PG_CLEAN, &(req)->wb_flags)) {
-               dec_zone_page_state(page, NR_UNSTABLE_NFS);
-               dec_bdi_stat(page->mapping->backing_dev_info, BDI_RECLAIMABLE);
-               return 1;
+               if (!pnfs_clear_request_commit(req)) {
+                       spin_lock(&inode->i_lock);
+                       nfs_request_remove_commit_list(req);
+                       spin_unlock(&inode->i_lock);
+               }
+               nfs_clear_page_commit(req->wb_page);
         }
-       return 0;
  }
  
  static inline
@@ -491,15 +528,14 @@ int nfs_reschedule_unstable_write(struct nfs_page *req,
         return 0;
  }
  #else
-static inline void
+static void
  nfs_mark_request_commit(struct nfs_page *req, struct pnfs_layout_segment *lseg)
  {
  }
  
-static inline int
+static void
  nfs_clear_request_commit(struct nfs_page *req)
  {
-       return 0;
  }
  
  static inline
@@ -520,46 +556,65 @@ int nfs_reschedule_unstable_write(struct nfs_page *req,
  static int
  nfs_need_commit(struct nfs_inode *nfsi)
  {
-       return radix_tree_tagged(&nfsi->nfs_page_tree, NFS_PAGE_TAG_COMMIT);
+       return nfsi->ncommit > 0;
+}
+
+/* i_lock held by caller */
+static int
+nfs_scan_commit_list(struct list_head *src, struct list_head *dst, int max,
+               spinlock_t *lock)
+{
+       struct nfs_page *req, *tmp;
+       int ret = 0;
+
+       list_for_each_entry_safe(req, tmp, src, wb_list) {
+               if (!nfs_lock_request(req))
+                       continue;
+               if (cond_resched_lock(lock))
+                       list_safe_reset_next(req, tmp, wb_list);
+               nfs_request_remove_commit_list(req);
+               nfs_list_add_request(req, dst);
+               ret++;
+               if (ret == max)
+                       break;
+       }
+       return ret;
  }
  
  /*
   * nfs_scan_commit - Scan an inode for commit requests
   * @inode: NFS inode to scan
   * @dst: destination list
- * @idx_start: lower bound of page->index to scan.
- * @npages: idx_start + npages sets the upper bound to scan.
   *
   * Moves requests from the inode's 'commit' request list.
   * The requests are *not* checked to ensure that they form a contiguous set.
   */
  static int
-nfs_scan_commit(struct inode *inode, struct list_head *dst, pgoff_t idx_start, unsigned int npages)
+nfs_scan_commit(struct inode *inode, struct list_head *dst)
  {
         struct nfs_inode *nfsi = NFS_I(inode);
-       int ret;
-
-       if (!nfs_need_commit(nfsi))
-               return 0;
+       int ret = 0;
  
         spin_lock(&inode->i_lock);
-       ret = nfs_scan_list(nfsi, dst, idx_start, npages, NFS_PAGE_TAG_COMMIT);
-       if (ret > 0)
-               nfsi->ncommit -= ret;
-       spin_unlock(&inode->i_lock);
-
-       if (nfs_need_commit(NFS_I(inode)))
-               __mark_inode_dirty(inode, I_DIRTY_DATASYNC);
+       if (nfsi->ncommit > 0) {
+               const int max = INT_MAX;
  
+               ret = nfs_scan_commit_list(&nfsi->commit_list, dst, max,
+                               &inode->i_lock);
+               ret += pnfs_scan_commit_lists(inode, max - ret,
+                               &inode->i_lock);
+       }
+       spin_unlock(&inode->i_lock);
         return ret;
  }
+
  #else
  static inline int nfs_need_commit(struct nfs_inode *nfsi)
  {
         return 0;
  }
  
-static inline int nfs_scan_commit(struct inode *inode, struct list_head *dst, pgoff_t idx_start, unsigned int npages)
+static inline int nfs_scan_commit(struct inode *inode, struct list_head *dst)
  {
         return 0;
  }
@@ -604,7 +659,7 @@ static struct nfs_page *nfs_try_to_update_request(struct inode *inode,
                     || end < req->wb_offset)
                         goto out_flushme;
  
-               if (nfs_set_page_tag_locked(req))
+               if (nfs_lock_request_dontget(req))
                         break;
  
                 /* The request is locked, so wait and then retry */
@@ -616,13 +671,6 @@ static struct nfs_page *nfs_try_to_update_request(struct inode *inode,
                 spin_lock(&inode->i_lock);
         }
  
-       if (nfs_clear_request_commit(req) &&
-           radix_tree_tag_clear(&NFS_I(inode)->nfs_page_tree,
-                                req->wb_index, NFS_PAGE_TAG_COMMIT) != NULL) {
-               NFS_I(inode)->ncommit--;
-               pnfs_clear_request_commit(req);
-       }
-
         /* Okay, the request matches. Update the region */
         if (offset < req->wb_offset) {
                 req->wb_offset = offset;
@@ -634,6 +682,7 @@ static struct nfs_page *nfs_try_to_update_request(struct inode *inode,
                 req->wb_bytes = rqend - req->wb_offset;
  out_unlock:
         spin_unlock(&inode->i_lock);
+       nfs_clear_request_commit(req);
         return req;
  out_flushme:
         spin_unlock(&inode->i_lock);
@@ -655,7 +704,6 @@ static struct nfs_page * nfs_setup_write_request(struct nfs_open_context* ctx,
  {
         struct inode *inode = page->mapping->host;
         struct nfs_page *req;
-       int error;
  
         req = nfs_try_to_update_request(inode, page, offset, bytes);
         if (req != NULL)
@@ -663,11 +711,7 @@ static struct nfs_page * nfs_setup_write_request(struct nfs_open_context* ctx,
         req = nfs_create_request(ctx, inode, page, offset, bytes);
         if (IS_ERR(req))
                 goto out;
-       error = nfs_inode_add_request(inode, req);
-       if (error != 0) {
-               nfs_release_request(req);
-               req = ERR_PTR(error);
-       }
+       nfs_inode_add_request(inode, req);
  out:
         return req;
  }
@@ -684,7 +728,7 @@ static int nfs_writepage_setup(struct nfs_open_context *ctx, struct page *page,
         nfs_grow_file(page, offset, count);
         nfs_mark_uptodate(page, req->wb_pgbase, req->wb_bytes);
         nfs_mark_request_dirty(req);
-       nfs_clear_page_tag_locked(req);
+       nfs_unlock_request(req);
         return 0;
  }
  
@@ -777,7 +821,7 @@ static void nfs_writepage_release(struct nfs_page *req,
  
         if (PageError(req->wb_page) || !nfs_reschedule_unstable_write(req, data))
                 nfs_inode_remove_request(req);
-       nfs_clear_page_tag_locked(req);
+       nfs_unlock_request(req);
         nfs_end_page_writeback(page);
  }
  
@@ -925,7 +969,7 @@ static void nfs_redirty_request(struct nfs_page *req)
         struct page *page = req->wb_page;
  
         nfs_mark_request_dirty(req);
-       nfs_clear_page_tag_locked(req);
+       nfs_unlock_request(req);
         nfs_end_page_writeback(page);
  }
  
@@ -1128,23 +1172,14 @@ out:
         nfs_writedata_release(calldata);
  }
  
-#if defined(CONFIG_NFS_V4_1)
  void nfs_write_prepare(struct rpc_task *task, void *calldata)
  {
         struct nfs_write_data *data = calldata;
-
-       if (nfs4_setup_sequence(NFS_SERVER(data->inode),
-                               &data->args.seq_args,
-                               &data->res.seq_res, 1, task))
-               return;
-       rpc_call_start(task);
+       NFS_PROTO(data->inode)->write_rpc_prepare(task, data);
  }
-#endif /* CONFIG_NFS_V4_1 */
  
  static const struct rpc_call_ops nfs_write_partial_ops = {
-#if defined(CONFIG_NFS_V4_1)
         .rpc_call_prepare = nfs_write_prepare,
-#endif /* CONFIG_NFS_V4_1 */
         .rpc_call_done = nfs_writeback_done_partial,
         .rpc_release = nfs_writeback_release_partial,
  };
@@ -1199,16 +1234,14 @@ static void nfs_writeback_release_full(void *calldata)
  remove_request:
                 nfs_inode_remove_request(req);
         next:
-               nfs_clear_page_tag_locked(req);
+               nfs_unlock_request(req);
                 nfs_end_page_writeback(page);
         }
         nfs_writedata_release(calldata);
  }
  
  static const struct rpc_call_ops nfs_write_full_ops = {
-#if defined(CONFIG_NFS_V4_1)
         .rpc_call_prepare = nfs_write_prepare,
-#endif /* CONFIG_NFS_V4_1 */
         .rpc_call_done = nfs_writeback_done_full,
         .rpc_release = nfs_writeback_release_full,
  };
@@ -1325,7 +1358,6 @@ void nfs_commitdata_release(void *data)
  {
         struct nfs_write_data *wdata = data;
  
-       put_lseg(wdata->lseg);
         put_nfs_open_context(wdata->args.context);
         nfs_commit_free(wdata);
  }
@@ -1411,7 +1443,7 @@ void nfs_retry_commit(struct list_head *page_list,
                 dec_zone_page_state(req->wb_page, NR_UNSTABLE_NFS);
                 dec_bdi_stat(req->wb_page->mapping->backing_dev_info,
                              BDI_RECLAIMABLE);
-               nfs_clear_page_tag_locked(req);
+               nfs_unlock_request(req);
         }
  }
  EXPORT_SYMBOL_GPL(nfs_retry_commit);
@@ -1460,7 +1492,7 @@ void nfs_commit_release_pages(struct nfs_write_data *data)
         while (!list_empty(&data->pages)) {
                 req = nfs_list_entry(data->pages.next);
                 nfs_list_remove_request(req);
-               nfs_clear_request_commit(req);
+               nfs_clear_page_commit(req->wb_page);
  
                 dprintk("NFS:       commit (%s/%lld %d@%lld)",
                         req->wb_context->dentry->d_sb->s_id,
@@ -1486,7 +1518,7 @@ void nfs_commit_release_pages(struct nfs_write_data *data)
                 dprintk(" mismatch\n");
                 nfs_mark_request_dirty(req);
         next:
-               nfs_clear_page_tag_locked(req);
+               nfs_unlock_request(req);
         }
  }
  EXPORT_SYMBOL_GPL(nfs_commit_release_pages);
@@ -1501,9 +1533,7 @@ static void nfs_commit_release(void *calldata)
  }
  
  static const struct rpc_call_ops nfs_commit_ops = {
-#if defined(CONFIG_NFS_V4_1)
         .rpc_call_prepare = nfs_write_prepare,
-#endif /* CONFIG_NFS_V4_1 */
         .rpc_call_done = nfs_commit_done,
         .rpc_release = nfs_commit_release,
  };
@@ -1517,7 +1547,7 @@ int nfs_commit_inode(struct inode *inode, int how)
         res = nfs_commit_set_lock(NFS_I(inode), may_wait);
         if (res <= 0)
                 goto out_mark_dirty;
-       res = nfs_scan_commit(inode, &head, 0, 0);
+       res = nfs_scan_commit(inode, &head);
         if (res) {
                 int error;
  
@@ -1635,6 +1665,7 @@ int nfs_wb_page_cancel(struct inode *inode, struct page *page)
                 if (req == NULL)
                         break;
                 if (nfs_lock_request_dontget(req)) {
+                       nfs_clear_request_commit(req);
                         nfs_inode_remove_request(req);
                         /*
                          * In case nfs_inode_remove_request has marked the
diff --git a/fs/nfsd/nfs4callback.c b/fs/nfsd/nfs4callback.c
index 6f3ebb48b12fad4532884df525e8942bd3564c90..0e262f32ac415a577793c74bb8cf6e7cd8d9202f 100644
--- a/fs/nfsd/nfs4callback.c
+++ b/fs/nfsd/nfs4callback.c
@@ -605,24 +605,24 @@ static struct rpc_version nfs_cb_version4 = {
         .procs                  = nfs4_cb_procedures
  };
  
-static struct rpc_version *nfs_cb_version[] = {
+static const struct rpc_version *nfs_cb_version[] = {
         &nfs_cb_version4,
  };
  
-static struct rpc_program cb_program;
+static const struct rpc_program cb_program;
  
  static struct rpc_stat cb_stats = {
         .program                = &cb_program
  };
  
  #define NFS4_CALLBACK 0x40000000
-static struct rpc_program cb_program = {
+static const struct rpc_program cb_program = {
         .name                   = "nfs4_cb",
         .number                 = NFS4_CALLBACK,
         .nrvers                 = ARRAY_SIZE(nfs_cb_version),
         .version                = nfs_cb_version,
         .stats                  = &cb_stats,
-       .pipe_dir_name          = "/nfsd4_cb",
+       .pipe_dir_name          = "nfsd4_cb",
  };
  
  static int max_cb_time(void)
diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index e8c98f0096706c04e70456c35af2f8123241af7a..c5cddd659429f33b371ea03a3d920808911ce8f1 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -1308,7 +1308,7 @@ gen_callback(struct nfs4_client *clp, struct nfsd4_setclientid *se, struct svc_r
         else
                 goto out_err;
  
-       conn->cb_addrlen = rpc_uaddr2sockaddr(se->se_callback_addr_val,
+       conn->cb_addrlen = rpc_uaddr2sockaddr(&init_net, se->se_callback_addr_val,
                                             se->se_callback_addr_len,
                                             (struct sockaddr *)&conn->cb_addr,
                                             sizeof(conn->cb_addr));
diff --git a/fs/nfsd/nfsctl.c b/fs/nfsd/nfsctl.c
index 748eda93ce590d1ad1e4f7892f29e25f8ad8856a..64c24af8d7eaf40d5436aea2b2a6c44d588102e5 100644
--- a/fs/nfsd/nfsctl.c
+++ b/fs/nfsd/nfsctl.c
@@ -223,7 +223,7 @@ static ssize_t write_unlock_ip(struct file *file, char *buf, size_t size)
         if (qword_get(&buf, fo_path, size) < 0)
                 return -EINVAL;
  
-       if (rpc_pton(fo_path, size, sap, salen) == 0)
+       if (rpc_pton(&init_net, fo_path, size, sap, salen) == 0)
                 return -EINVAL;
  
         return nlmsvc_unlock_all_by_ip(sap);
@@ -722,7 +722,7 @@ static ssize_t __write_ports_addxprt(char *buf)
         nfsd_serv->sv_nrthreads--;
         return 0;
  out_close:
-       xprt = svc_find_xprt(nfsd_serv, transport, PF_INET, port);
+       xprt = svc_find_xprt(nfsd_serv, transport, &init_net, PF_INET, port);
         if (xprt != NULL) {
                 svc_close_xprt(xprt);
                 svc_xprt_put(xprt);
@@ -748,7 +748,7 @@ static ssize_t __write_ports_delxprt(char *buf)
         if (port < 1 || port > USHRT_MAX || nfsd_serv == NULL)
                 return -EINVAL;
  
-       xprt = svc_find_xprt(nfsd_serv, transport, AF_UNSPEC, port);
+       xprt = svc_find_xprt(nfsd_serv, transport, &init_net, AF_UNSPEC, port);
         if (xprt == NULL)
                 return -ENOTCONN;
  
diff --git a/fs/nfsd/nfssvc.c b/fs/nfsd/nfssvc.c
index eda7d7e55e05c45aa309f1465f60368bf3f00242..fce472f5f39e74f2fb9ab36bdf70019f573a73af 100644
--- a/fs/nfsd/nfssvc.c
+++ b/fs/nfsd/nfssvc.c
@@ -251,13 +251,13 @@ static void nfsd_shutdown(void)
         nfsd_up = false;
  }
  
-static void nfsd_last_thread(struct svc_serv *serv)
+static void nfsd_last_thread(struct svc_serv *serv, struct net *net)
  {
         /* When last nfsd thread exits we need to do some clean-up */
         nfsd_serv = NULL;
         nfsd_shutdown();
  
-       svc_rpcb_cleanup(serv);
+       svc_rpcb_cleanup(serv, net);
  
         printk(KERN_WARNING "nfsd: last server has exited, flushing export "
                             "cache\n");
diff --git a/fs/nfsd/stats.c b/fs/nfsd/stats.c
index a2e2402b2afb5a45200b5902fadabcb144d16ebf..6d4521feb6e339729d9e9e8583088fcd1e0ec6f9 100644
--- a/fs/nfsd/stats.c
+++ b/fs/nfsd/stats.c
@@ -25,6 +25,7 @@
  #include <linux/module.h>
  #include <linux/sunrpc/stats.h>
  #include <linux/nfsd/stats.h>
+#include <net/net_namespace.h>
  
  #include "nfsd.h"
  
@@ -94,11 +95,11 @@ static const struct file_operations nfsd_proc_fops = {
  void
  nfsd_stat_init(void)
  {
-       svc_proc_register(&nfsd_svcstats, &nfsd_proc_fops);
+       svc_proc_register(&init_net, &nfsd_svcstats, &nfsd_proc_fops);
  }
  
  void
  nfsd_stat_shutdown(void)
  {
-       svc_proc_unregister("nfsd");
+       svc_proc_unregister(&init_net, "nfsd");
  }
diff --git a/include/linux/key.h b/include/linux/key.h
index 1600ebf717a79b4259a721e21bd096ec6206b1ed..96933b1e5d24eeee3dde80b5336b0053a0c1d06e 100644
--- a/include/linux/key.h
+++ b/include/linux/key.h
@@ -277,6 +277,8 @@ static inline key_serial_t key_serial(const struct key *key)
         return key ? key->serial : 0;
  }
  
+extern void key_set_timeout(struct key *, unsigned);
+
  /**
   * key_is_instantiated - Determine if a key has been positively instantiated
   * @key: The key to check.
diff --git a/include/linux/lockd/bind.h b/include/linux/lockd/bind.h
index fbc48f898521c1a24492c8eb782c12301e341568..11a966e5f829e9d9862589e393c1576780cfed48 100644
--- a/include/linux/lockd/bind.h
+++ b/include/linux/lockd/bind.h
@@ -42,6 +42,7 @@ struct nlmclnt_initdata {
         unsigned short          protocol;
         u32                     nfs_version;
         int                     noresvport;
+       struct net              *net;
  };
  
  /*
diff --git a/include/linux/lockd/lockd.h b/include/linux/lockd/lockd.h
index 88a114fce477ad5a42e5e5c3cb837ae7a8a1384e..f04ce6ac6d04fd84f65c533cb34b5748d2d6665f 100644
--- a/include/linux/lockd/lockd.h
+++ b/include/linux/lockd/lockd.h
@@ -67,6 +67,7 @@ struct nlm_host {
         struct list_head        h_reclaim;      /* Locks in RECLAIM state */
         struct nsm_handle       *h_nsmhandle;   /* NSM status handle */
         char                    *h_addrbuf;     /* address eyecatcher */
+       struct net              *net;           /* host net */
  };
  
  /*
@@ -188,7 +189,7 @@ struct nlm_block {
  /*
   * Global variables
   */
-extern struct rpc_program      nlm_program;
+extern const struct rpc_program        nlm_program;
  extern struct svc_procedure    nlmsvc_procedures[];
  #ifdef CONFIG_LOCKD_V4
  extern struct svc_procedure    nlmsvc_procedures4[];
@@ -222,7 +223,8 @@ struct nlm_host  *nlmclnt_lookup_host(const struct sockaddr *sap,
                                         const unsigned short protocol,
                                         const u32 version,
                                         const char *hostname,
-                                       int noresvport);
+                                       int noresvport,
+                                       struct net *net);
  void             nlmclnt_release_host(struct nlm_host *);
  struct nlm_host  *nlmsvc_lookup_host(const struct svc_rqst *rqstp,
                                         const char *hostname,
@@ -232,6 +234,7 @@ struct rpc_clnt * nlm_bind_host(struct nlm_host *);
  void             nlm_rebind_host(struct nlm_host *);
  struct nlm_host * nlm_get_host(struct nlm_host *);
  void             nlm_shutdown_hosts(void);
+void             nlm_shutdown_hosts_net(struct net *net);
  void             nlm_host_rebooted(const struct nlm_reboot *);
  
  /*
diff --git a/include/linux/lockd/xdr4.h b/include/linux/lockd/xdr4.h
index 7353821341edb76bc1e02e646ed01fe32200768b..e58c88b52ce138f638f9adcd0c673b895291bb1d 100644
--- a/include/linux/lockd/xdr4.h
+++ b/include/linux/lockd/xdr4.h
@@ -42,6 +42,6 @@ int   nlmclt_encode_lockargs(struct rpc_rqst *, u32 *, struct nlm_args *);
  int    nlmclt_encode_cancargs(struct rpc_rqst *, u32 *, struct nlm_args *);
  int    nlmclt_encode_unlockargs(struct rpc_rqst *, u32 *, struct nlm_args *);
   */
-extern struct rpc_version nlm_version4;
+extern const struct rpc_version nlm_version4;
  
  #endif /* LOCKD_XDR4_H */
diff --git a/include/linux/nfs.h b/include/linux/nfs.h
index 8c6ee44914cb4c7dc464167a2163625ac6c6f8f8..6d1fb63f59221690f72d62f1d8a5cae71070fcc5 100644
--- a/include/linux/nfs.h
+++ b/include/linux/nfs.h
@@ -29,7 +29,7 @@
  #define NFS_MNT_VERSION                1
  #define NFS_MNT3_VERSION       3
  
-#define NFS_PIPE_DIRNAME "/nfs"
+#define NFS_PIPE_DIRNAME "nfs"
  
  /*
   * NFS stats. The good thing with these values is that NFSv3 errors are
diff --git a/include/linux/nfs4.h b/include/linux/nfs4.h
index 32345c2805c0588bb2373b63e234359271fc7cfe..834df8bf08b6e54951bc6483f0ffecbc47c7bd28 100644
--- a/include/linux/nfs4.h
+++ b/include/linux/nfs4.h
@@ -183,15 +183,12 @@ struct nfs4_acl {
  
  typedef struct { char data[NFS4_VERIFIER_SIZE]; } nfs4_verifier;
  
-struct nfs41_stateid {
+struct nfs_stateid4 {
         __be32 seqid;
         char other[NFS4_STATEID_OTHER_SIZE];
  } __attribute__ ((packed));
  
-typedef union {
-       char data[NFS4_STATEID_SIZE];
-       struct nfs41_stateid stateid;
-} nfs4_stateid;
+typedef struct nfs_stateid4 nfs4_stateid;
  
  enum nfs_opnum4 {
         OP_ACCESS = 3,
diff --git a/include/linux/nfs_fs.h b/include/linux/nfs_fs.h
index 8c29950d2fa5041497c22614808f682e76b3681c..52a1bdb4ee2bad0a668262c7b67bf8003f738095 100644
--- a/include/linux/nfs_fs.h
+++ b/include/linux/nfs_fs.h
@@ -38,6 +38,13 @@
  
  #ifdef __KERNEL__
  
+/*
+ * Enable dprintk() debugging support for nfs client.
+ */
+#ifdef CONFIG_NFS_DEBUG
+# define NFS_DEBUG
+#endif
+
  #include <linux/in.h>
  #include <linux/mm.h>
  #include <linux/pagemap.h>
@@ -171,13 +178,9 @@ struct nfs_inode {
          */
         __be32                  cookieverf[2];
  
-       /*
-        * This is the list of dirty unwritten pages.
-        */
-       struct radix_tree_root  nfs_page_tree;
-
         unsigned long           npages;
         unsigned long           ncommit;
+       struct list_head        commit_list;
  
         /* Open contexts for shared mmap writes */
         struct list_head        open_files;
@@ -395,6 +398,29 @@ static inline void nfs_free_fhandle(const struct nfs_fh *fh)
         kfree(fh);
  }
  
+#ifdef NFS_DEBUG
+extern u32 _nfs_display_fhandle_hash(const struct nfs_fh *fh);
+static inline u32 nfs_display_fhandle_hash(const struct nfs_fh *fh)
+{
+       return _nfs_display_fhandle_hash(fh);
+}
+extern void _nfs_display_fhandle(const struct nfs_fh *fh, const char *caption);
+#define nfs_display_fhandle(fh, caption)                       \
+       do {                                                    \
+               if (unlikely(nfs_debug & NFSDBG_FACILITY))      \
+                       _nfs_display_fhandle(fh, caption);      \
+       } while (0)
+#else
+static inline u32 nfs_display_fhandle_hash(const struct nfs_fh *fh)
+{
+       return 0;
+}
+static inline void nfs_display_fhandle(const struct nfs_fh *fh,
+                                      const char *caption)
+{
+}
+#endif
+
  /*
   * linux/fs/nfs/nfsroot.c
   */
@@ -632,19 +658,13 @@ nfs_fileid_to_ino_t(u64 fileid)
  
  #ifdef __KERNEL__
  
-/*
- * Enable debugging support for nfs client.
- * Requires RPC_DEBUG.
- */
-#ifdef RPC_DEBUG
-# define NFS_DEBUG
-#endif
-
  # undef ifdebug
  # ifdef NFS_DEBUG
  #  define ifdebug(fac)         if (unlikely(nfs_debug & NFSDBG_##fac))
+#  define NFS_IFDEBUG(x)       x
  # else
  #  define ifdebug(fac)         if (0)
+#  define NFS_IFDEBUG(x)
  # endif
  #endif /* __KERNEL */
  
diff --git a/include/linux/nfs_fs_i.h b/include/linux/nfs_fs_i.h
index 861730275ba0545feb6857c42f9d161d7f38d47c..a5c50d97341edfac63d04a0f7bad96a5f1bfd0f9 100644
--- a/include/linux/nfs_fs_i.h
+++ b/include/linux/nfs_fs_i.h
@@ -1,10 +1,6 @@
  #ifndef _NFS_FS_I
  #define _NFS_FS_I
  
-#include <asm/types.h>
-#include <linux/list.h>
-#include <linux/nfs.h>
-
  struct nlm_lockowner;
  
  /*
diff --git a/include/linux/nfs_fs_sb.h b/include/linux/nfs_fs_sb.h
index ba4d7656ecfde15c98188afee4a086b19746772f..7073fc74481cb6e1d69b0278e26c87c52cbc349e 100644
--- a/include/linux/nfs_fs_sb.h
+++ b/include/linux/nfs_fs_sb.h
@@ -3,6 +3,7 @@
  
  #include <linux/list.h>
  #include <linux/backing-dev.h>
+#include <linux/idr.h>
  #include <linux/wait.h>
  #include <linux/nfs_xdr.h>
  #include <linux/sunrpc/xprt.h>
@@ -17,6 +18,7 @@ struct nfs4_sequence_res;
  struct nfs_server;
  struct nfs4_minor_version_ops;
  struct server_scope;
+struct nfs41_impl_id;
  
  /*
   * The nfs_client identifies our client state to the server.
@@ -85,6 +87,8 @@ struct nfs_client {
  #endif
  
         struct server_scope     *server_scope;  /* from exchange_id */
+       struct nfs41_impl_id    *impl_id;       /* from exchange_id */
+       struct net              *net;
  };
  
  /*
@@ -144,15 +148,18 @@ struct nfs_server {
         u32                     acl_bitmask;    /* V4 bitmask representing the ACEs
                                                    that are supported on this
                                                    filesystem */
+       u32                     fh_expire_type; /* V4 bitmask representing file
+                                                  handle volatility type for
+                                                  this filesystem */
         struct pnfs_layoutdriver_type  *pnfs_curr_ld; /* Active layout driver */
         struct rpc_wait_queue   roc_rpcwaitq;
         void                    *pnfs_ld_data;  /* per mount point data */
  
         /* the following fields are protected by nfs_client->cl_lock */
         struct rb_root          state_owners;
-       struct rb_root          openowner_id;
-       struct rb_root          lockowner_id;
  #endif
+       struct ida              openowner_id;
+       struct ida              lockowner_id;
         struct list_head        state_owners_lru;
         struct list_head        layouts;
         struct list_head        delegations;
@@ -188,21 +195,23 @@ struct nfs_server {
  
  
  /* maximum number of slots to use */
-#define NFS4_MAX_SLOT_TABLE RPC_MAX_SLOT_TABLE
+#define NFS4_DEF_SLOT_TABLE_SIZE (16U)
+#define NFS4_MAX_SLOT_TABLE (256U)
+#define NFS4_NO_SLOT ((u32)-1)
  
  #if defined(CONFIG_NFS_V4)
  
  /* Sessions */
-#define SLOT_TABLE_SZ (NFS4_MAX_SLOT_TABLE/(8*sizeof(long)))
+#define SLOT_TABLE_SZ DIV_ROUND_UP(NFS4_MAX_SLOT_TABLE, 8*sizeof(long))
  struct nfs4_slot_table {
         struct nfs4_slot *slots;                /* seqid per slot */
         unsigned long   used_slots[SLOT_TABLE_SZ]; /* used/unused bitmap */
         spinlock_t      slot_tbl_lock;
         struct rpc_wait_queue   slot_tbl_waitq; /* allocators may wait here */
-       int             max_slots;              /* # slots in table */
-       int             highest_used_slotid;    /* sent to server on each SEQ.
+       u32             max_slots;              /* # slots in table */
+       u32             highest_used_slotid;    /* sent to server on each SEQ.
                                                  * op for dynamic resizing */
-       int             target_max_slots;       /* Set by CB_RECALL_SLOT as
+       u32             target_max_slots;       /* Set by CB_RECALL_SLOT as
                                                  * the new max_slots */
         struct completion complete;
  };
diff --git a/include/linux/nfs_idmap.h b/include/linux/nfs_idmap.h
index 308c188770185962547e196312aa87f304aa4502..7eed2012d288926a6317e53ead4eb7880796c7ca 100644
--- a/include/linux/nfs_idmap.h
+++ b/include/linux/nfs_idmap.h
@@ -69,36 +69,22 @@ struct nfs_server;
  struct nfs_fattr;
  struct nfs4_string;
  
-#ifdef CONFIG_NFS_USE_NEW_IDMAPPER
-
+#ifdef CONFIG_NFS_V4
  int nfs_idmap_init(void);
  void nfs_idmap_quit(void);
-
-static inline int nfs_idmap_new(struct nfs_client *clp)
-{
-       return 0;
-}
-
-static inline void nfs_idmap_delete(struct nfs_client *clp)
-{
-}
-
-#else /* CONFIG_NFS_USE_NEW_IDMAPPER not set */
-
+#else
  static inline int nfs_idmap_init(void)
  {
         return 0;
  }
  
  static inline void nfs_idmap_quit(void)
-{
-}
+{}
+#endif
  
  int nfs_idmap_new(struct nfs_client *);
  void nfs_idmap_delete(struct nfs_client *);
  
-#endif /* CONFIG_NFS_USE_NEW_IDMAPPER */
-
  void nfs_fattr_init_names(struct nfs_fattr *fattr,
                 struct nfs4_string *owner_name,
                 struct nfs4_string *group_name);
diff --git a/include/linux/nfs_iostat.h b/include/linux/nfs_iostat.h
index 8866bb3502ee1df94075aa2068a980cb7fe2f090..9dcbbe9a51fb17da49fe9d8e8a4deaadd67ccec1 100644
--- a/include/linux/nfs_iostat.h
+++ b/include/linux/nfs_iostat.h
@@ -21,7 +21,7 @@
  #ifndef _LINUX_NFS_IOSTAT
  #define _LINUX_NFS_IOSTAT
  
-#define NFS_IOSTAT_VERS                "1.0"
+#define NFS_IOSTAT_VERS                "1.1"
  
  /*
   * NFS byte counters
diff --git a/include/linux/nfs_page.h b/include/linux/nfs_page.h
index ab465fe8c3d6fe2d3e9f204b3a013bd22fb14038..eac30d6bec17c78db77a050e269ae0336e00e372 100644
--- a/include/linux/nfs_page.h
+++ b/include/linux/nfs_page.h
@@ -18,12 +18,6 @@
  
  #include <linux/kref.h>
  
-/*
- * Valid flags for the radix tree
- */
-#define NFS_PAGE_TAG_LOCKED    0
-#define NFS_PAGE_TAG_COMMIT    1
-
  /*
   * Valid flags for a dirty buffer
   */
@@ -33,16 +27,13 @@ enum {
         PG_CLEAN,
         PG_NEED_COMMIT,
         PG_NEED_RESCHED,
-       PG_PNFS_COMMIT,
         PG_PARTIAL_READ_FAILED,
+       PG_COMMIT_TO_DS,
  };
  
  struct nfs_inode;
  struct nfs_page {
-       union {
-               struct list_head        wb_list;        /* Defines state of page: */
-               struct pnfs_layout_segment *wb_commit_lseg; /* Used when PG_PNFS_COMMIT set */
-       };
+       struct list_head        wb_list;        /* Defines state of page: */
         struct page             *wb_page;       /* page to read in/write out */
         struct nfs_open_context *wb_context;    /* File state context info */
         struct nfs_lock_context *wb_lock_context;       /* lock context info */
@@ -90,8 +81,6 @@ extern        struct nfs_page *nfs_create_request(struct nfs_open_context *ctx,
  extern void nfs_release_request(struct nfs_page *req);
  
  
-extern int nfs_scan_list(struct nfs_inode *nfsi, struct list_head *dst,
-                         pgoff_t idx_start, unsigned int npages, int tag);
  extern void nfs_pageio_init(struct nfs_pageio_descriptor *desc,
                              struct inode *inode,
                              const struct nfs_pageio_ops *pg_ops,
@@ -106,8 +95,6 @@ extern bool nfs_generic_pg_test(struct nfs_pageio_descriptor *desc,
                                 struct nfs_page *req);
  extern  int nfs_wait_on_request(struct nfs_page *);
  extern void nfs_unlock_request(struct nfs_page *req);
-extern int nfs_set_page_tag_locked(struct nfs_page *req);
-extern  void nfs_clear_page_tag_locked(struct nfs_page *req);
  
  /*
   * Lock the page of an asynchronous request without getting a new reference
@@ -118,6 +105,16 @@ nfs_lock_request_dontget(struct nfs_page *req)
         return !test_and_set_bit(PG_BUSY, &req->wb_flags);
  }
  
+static inline int
+nfs_lock_request(struct nfs_page *req)
+{
+       if (test_and_set_bit(PG_BUSY, &req->wb_flags))
+               return 0;
+       kref_get(&req->wb_kref);
+       return 1;
+}
+
+
  /**
   * nfs_list_add_request - Insert a request into a list
   * @req: request
diff --git a/include/linux/nfs_xdr.h b/include/linux/nfs_xdr.h
index d6ba9a12591ea464991b1d36157918ee5fc1eebe..bfd0d1bf67072e9a5a6f6c143e80f56fcc6238aa 100644
--- a/include/linux/nfs_xdr.h
+++ b/include/linux/nfs_xdr.h
@@ -2,7 +2,6 @@
  #define _LINUX_NFS_XDR_H
  
  #include <linux/nfsacl.h>
-#include <linux/nfs3.h>
  #include <linux/sunrpc/gss_api.h>
  
  /*
@@ -89,11 +88,12 @@ struct nfs_fattr {
  #define NFS_ATTR_FATTR_PRECTIME                (1U << 16)
  #define NFS_ATTR_FATTR_CHANGE          (1U << 17)
  #define NFS_ATTR_FATTR_PRECHANGE       (1U << 18)
-#define NFS_ATTR_FATTR_V4_REFERRAL     (1U << 19)      /* NFSv4 referral */
-#define NFS_ATTR_FATTR_MOUNTPOINT      (1U << 20)      /* Treat as mountpoint */
-#define NFS_ATTR_FATTR_MOUNTED_ON_FILEID               (1U << 21)
-#define NFS_ATTR_FATTR_OWNER_NAME      (1U << 22)
-#define NFS_ATTR_FATTR_GROUP_NAME      (1U << 23)
+#define NFS_ATTR_FATTR_V4_LOCATIONS    (1U << 19)
+#define NFS_ATTR_FATTR_V4_REFERRAL     (1U << 20)
+#define NFS_ATTR_FATTR_MOUNTPOINT      (1U << 21)
+#define NFS_ATTR_FATTR_MOUNTED_ON_FILEID (1U << 22)
+#define NFS_ATTR_FATTR_OWNER_NAME      (1U << 23)
+#define NFS_ATTR_FATTR_GROUP_NAME      (1U << 24)
  
  #define NFS_ATTR_FATTR (NFS_ATTR_FATTR_TYPE \
                 | NFS_ATTR_FATTR_MODE \
@@ -182,7 +182,7 @@ struct nfs4_slot {
  
  struct nfs4_sequence_args {
         struct nfs4_session     *sa_session;
-       u8                      sa_slotid;
+       u32                     sa_slotid;
         u8                      sa_cache_this;
  };
  
@@ -977,6 +977,7 @@ struct nfs4_server_caps_res {
         u32                             acl_bitmask;
         u32                             has_links;
         u32                             has_symlinks;
+       u32                             fh_expire_type;
         struct nfs4_sequence_res        seq_res;
  };
  
@@ -1055,14 +1056,6 @@ struct nfstime4 {
  };
  
  #ifdef CONFIG_NFS_V4_1
-struct nfs_impl_id4 {
-       u32             domain_len;
-       char            *domain;
-       u32             name_len;
-       char            *name;
-       struct nfstime4 date;
-};
-
  #define NFS4_EXCHANGE_ID_LEN   (48)
  struct nfs41_exchange_id_args {
         struct nfs_client               *client;
@@ -1083,10 +1076,17 @@ struct server_scope {
         char                            server_scope[NFS4_OPAQUE_LIMIT];
  };
  
+struct nfs41_impl_id {
+       char                            domain[NFS4_OPAQUE_LIMIT + 1];
+       char                            name[NFS4_OPAQUE_LIMIT + 1];
+       struct nfstime4                 date;
+};
+
  struct nfs41_exchange_id_res {
         struct nfs_client               *client;
         u32                             flags;
         struct server_scope             *server_scope;
+       struct nfs41_impl_id            *impl_id;
  };
  
  struct nfs41_create_session_args {
@@ -1192,6 +1192,27 @@ struct nfs_write_data {
         struct page             *page_array[NFS_PAGEVEC_SIZE];
  };
  
+struct nfs_unlinkdata {
+       struct hlist_node list;
+       struct nfs_removeargs args;
+       struct nfs_removeres res;
+       struct inode *dir;
+       struct rpc_cred *cred;
+       struct nfs_fattr dir_attr;
+};
+
+struct nfs_renamedata {
+       struct nfs_renameargs   args;
+       struct nfs_renameres    res;
+       struct rpc_cred         *cred;
+       struct inode            *old_dir;
+       struct dentry           *old_dentry;
+       struct nfs_fattr        old_fattr;
+       struct inode            *new_dir;
+       struct dentry           *new_dentry;
+       struct nfs_fattr        new_fattr;
+};
+
  struct nfs_access_entry;
  struct nfs_client;
  struct rpc_timeout;
@@ -1221,10 +1242,12 @@ struct nfs_rpc_ops {
                             struct iattr *, int, struct nfs_open_context *);
         int     (*remove)  (struct inode *, struct qstr *);
         void    (*unlink_setup)  (struct rpc_message *, struct inode *dir);
+       void    (*unlink_rpc_prepare) (struct rpc_task *, struct nfs_unlinkdata *);
         int     (*unlink_done) (struct rpc_task *, struct inode *);
         int     (*rename)  (struct inode *, struct qstr *,
                             struct inode *, struct qstr *);
         void    (*rename_setup)  (struct rpc_message *msg, struct inode *dir);
+       void    (*rename_rpc_prepare)(struct rpc_task *task, struct nfs_renamedata *);
         int     (*rename_done) (struct rpc_task *task, struct inode *old_dir, struct inode *new_dir);
         int     (*link)    (struct inode *, struct inode *, struct qstr *);
         int     (*symlink) (struct inode *, struct dentry *, struct page *,
@@ -1244,8 +1267,10 @@ struct nfs_rpc_ops {
         int     (*set_capabilities)(struct nfs_server *, struct nfs_fh *);
         int     (*decode_dirent)(struct xdr_stream *, struct nfs_entry *, int);
         void    (*read_setup)   (struct nfs_read_data *, struct rpc_message *);
+       void    (*read_rpc_prepare)(struct rpc_task *, struct nfs_read_data *);
         int     (*read_done)  (struct rpc_task *, struct nfs_read_data *);
         void    (*write_setup)  (struct nfs_write_data *, struct rpc_message *);
+       void    (*write_rpc_prepare)(struct rpc_task *, struct nfs_write_data *);
         int     (*write_done)  (struct rpc_task *, struct nfs_write_data *);
         void    (*commit_setup) (struct nfs_write_data *, struct rpc_message *);
         int     (*commit_done) (struct rpc_task *, struct nfs_write_data *);
@@ -1275,11 +1300,11 @@ struct nfs_rpc_ops {
  extern const struct nfs_rpc_ops        nfs_v2_clientops;
  extern const struct nfs_rpc_ops        nfs_v3_clientops;
  extern const struct nfs_rpc_ops        nfs_v4_clientops;
-extern struct rpc_version      nfs_version2;
-extern struct rpc_version      nfs_version3;
-extern struct rpc_version      nfs_version4;
+extern const struct rpc_version nfs_version2;
+extern const struct rpc_version nfs_version3;
+extern const struct rpc_version nfs_version4;
  
-extern struct rpc_version      nfsacl_version3;
-extern struct rpc_program      nfsacl_program;
+extern const struct rpc_version nfsacl_version3;
+extern const struct rpc_program nfsacl_program;
  
  #endif
diff --git a/include/linux/sunrpc/auth.h b/include/linux/sunrpc/auth.h
index 7874a8a566386a02165ebc0bff474d8bf160f0c0..492a36d72829939c21f6c4ff3f5bd412956b2a95 100644
--- a/include/linux/sunrpc/auth.h
+++ b/include/linux/sunrpc/auth.h
@@ -99,6 +99,8 @@ struct rpc_authops {
  
         struct rpc_cred *       (*lookup_cred)(struct rpc_auth *, struct auth_cred *, int);
         struct rpc_cred *       (*crcreate)(struct rpc_auth*, struct auth_cred *, int);
+       int                     (*pipes_create)(struct rpc_auth *);
+       void                    (*pipes_destroy)(struct rpc_auth *);
  };
  
  struct rpc_credops {
diff --git a/include/linux/sunrpc/bc_xprt.h b/include/linux/sunrpc/bc_xprt.h
index f7f3ce340c083f04b311f06aac1e8e34239c8016..969c0a671dbfbbab399f7472d501a93d75bbd8fc 100644
--- a/include/linux/sunrpc/bc_xprt.h
+++ b/include/linux/sunrpc/bc_xprt.h
@@ -35,7 +35,7 @@ SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
  struct rpc_rqst *xprt_alloc_bc_request(struct rpc_xprt *xprt);
  void xprt_free_bc_request(struct rpc_rqst *req);
  int xprt_setup_backchannel(struct rpc_xprt *, unsigned int min_reqs);
-void xprt_destroy_backchannel(struct rpc_xprt *, int max_reqs);
+void xprt_destroy_backchannel(struct rpc_xprt *, unsigned int max_reqs);
  int bc_send(struct rpc_rqst *req);
  
  /*
diff --git a/include/linux/sunrpc/cache.h b/include/linux/sunrpc/cache.h
index 57531f8e5956dd07c83652789bbb9d7a8a40e399..f5fd6160dbca396835773609586e65071f9c78ab 100644
--- a/include/linux/sunrpc/cache.h
+++ b/include/linux/sunrpc/cache.h
@@ -117,6 +117,7 @@ struct cache_detail {
                 struct cache_detail_procfs procfs;
                 struct cache_detail_pipefs pipefs;
         } u;
+       struct net              *net;
  };
  
  
@@ -197,11 +198,14 @@ extern void cache_flush(void);
  extern void cache_purge(struct cache_detail *detail);
  #define NEVER (0x7FFFFFFF)
  extern void __init cache_initialize(void);
-extern int cache_register(struct cache_detail *cd);
  extern int cache_register_net(struct cache_detail *cd, struct net *net);
-extern void cache_unregister(struct cache_detail *cd);
  extern void cache_unregister_net(struct cache_detail *cd, struct net *net);
  
+extern struct cache_detail *cache_create_net(struct cache_detail *tmpl, struct net *net);
+extern void cache_destroy_net(struct cache_detail *cd, struct net *net);
+
+extern void sunrpc_init_cache_detail(struct cache_detail *cd);
+extern void sunrpc_destroy_cache_detail(struct cache_detail *cd);
  extern int sunrpc_cache_register_pipefs(struct dentry *parent, const char *,
                                         umode_t, struct cache_detail *);
  extern void sunrpc_cache_unregister_pipefs(struct cache_detail *);
diff --git a/include/linux/sunrpc/clnt.h b/include/linux/sunrpc/clnt.h
index 2c5993a17c3315423cbb3ba895fb8f24f4ac0bb3..523547ecfee2812c0f8d7fa069f7c8ac3f66ab2d 100644
--- a/include/linux/sunrpc/clnt.h
+++ b/include/linux/sunrpc/clnt.h
@@ -35,14 +35,13 @@ struct rpc_clnt {
         struct list_head        cl_clients;     /* Global list of clients */
         struct list_head        cl_tasks;       /* List of tasks */
         spinlock_t              cl_lock;        /* spinlock */
-       struct rpc_xprt *       cl_xprt;        /* transport */
+       struct rpc_xprt __rcu * cl_xprt;        /* transport */
         struct rpc_procinfo *   cl_procinfo;    /* procedure info */
         u32                     cl_prog,        /* RPC program number */
                                 cl_vers,        /* RPC version number */
                                 cl_maxproc;     /* max procedure number */
  
-       char *                  cl_server;      /* server machine name */
-       char *                  cl_protname;    /* protocol name */
+       const char *            cl_protname;    /* protocol name */
         struct rpc_auth *       cl_auth;        /* authenticator */
         struct rpc_stat *       cl_stats;       /* per-program statistics */
         struct rpc_iostats *    cl_metrics;     /* per-client statistics */
@@ -57,12 +56,11 @@ struct rpc_clnt {
  
         int                     cl_nodelen;     /* nodename length */
         char                    cl_nodename[UNX_MAXNODENAME];
-       struct path             cl_path;
+       struct dentry *         cl_dentry;
         struct rpc_clnt *       cl_parent;      /* Points to parent of clones */
         struct rpc_rtt          cl_rtt_default;
         struct rpc_timeout      cl_timeout_default;
-       struct rpc_program *    cl_program;
-       char                    cl_inline_name[32];
+       const struct rpc_program *cl_program;
         char                    *cl_principal;  /* target to authenticate to */
  };
  
@@ -71,12 +69,12 @@ struct rpc_clnt {
   */
  #define RPC_MAXVERSION         4
  struct rpc_program {
-       char *                  name;           /* protocol name */
+       const char *            name;           /* protocol name */
         u32                     number;         /* program number */
         unsigned int            nrvers;         /* number of versions */
-       struct rpc_version **   version;        /* version array */
+       const struct rpc_version **     version;        /* version array */
         struct rpc_stat *       stats;          /* statistics */
-       char *                  pipe_dir_name;  /* path to rpc_pipefs dir */
+       const char *            pipe_dir_name;  /* path to rpc_pipefs dir */
  };
  
  struct rpc_version {
@@ -97,7 +95,7 @@ struct rpc_procinfo {
         unsigned int            p_count;        /* call count */
         unsigned int            p_timer;        /* Which RTT timer to use */
         u32                     p_statidx;      /* Which procedure to account */
-       char *                  p_name;         /* name of procedure */
+       const char *            p_name;         /* name of procedure */
  };
  
  #ifdef __KERNEL__
@@ -109,8 +107,8 @@ struct rpc_create_args {
         size_t                  addrsize;
         struct sockaddr         *saddress;
         const struct rpc_timeout *timeout;
-       char                    *servername;
-       struct rpc_program      *program;
+       const char              *servername;
+       const struct rpc_program *program;
         u32                     prognumber;     /* overrides program->number */
         u32                     version;
         rpc_authflavor_t        authflavor;
@@ -129,17 +127,18 @@ struct rpc_create_args {
  
  struct rpc_clnt *rpc_create(struct rpc_create_args *args);
  struct rpc_clnt        *rpc_bind_new_program(struct rpc_clnt *,
-                               struct rpc_program *, u32);
+                               const struct rpc_program *, u32);
  void rpc_task_reset_client(struct rpc_task *task, struct rpc_clnt *clnt);
  struct rpc_clnt *rpc_clone_client(struct rpc_clnt *);
  void           rpc_shutdown_client(struct rpc_clnt *);
  void           rpc_release_client(struct rpc_clnt *);
  void           rpc_task_release_client(struct rpc_task *);
  
-int            rpcb_create_local(void);
-void           rpcb_put_local(void);
-int            rpcb_register(u32, u32, int, unsigned short);
-int            rpcb_v4_register(const u32 program, const u32 version,
+int            rpcb_create_local(struct net *);
+void           rpcb_put_local(struct net *);
+int            rpcb_register(struct net *, u32, u32, int, unsigned short);
+int            rpcb_v4_register(struct net *net, const u32 program,
+                                const u32 version,
                                  const struct sockaddr *address,
                                  const char *netid);
  void           rpcb_getport_async(struct rpc_task *);
@@ -156,16 +155,19 @@ struct rpc_task *rpc_call_null(struct rpc_clnt *clnt, struct rpc_cred *cred,
  int            rpc_restart_call_prepare(struct rpc_task *);
  int            rpc_restart_call(struct rpc_task *);
  void           rpc_setbufsize(struct rpc_clnt *, unsigned int, unsigned int);
+int            rpc_protocol(struct rpc_clnt *);
+struct net *   rpc_net_ns(struct rpc_clnt *);
  size_t         rpc_max_payload(struct rpc_clnt *);
  void           rpc_force_rebind(struct rpc_clnt *);
  size_t         rpc_peeraddr(struct rpc_clnt *, struct sockaddr *, size_t);
  const char     *rpc_peeraddr2str(struct rpc_clnt *, enum rpc_display_format_t);
+int            rpc_localaddr(struct rpc_clnt *, struct sockaddr *, size_t);
  
  size_t         rpc_ntop(const struct sockaddr *, char *, const size_t);
-size_t         rpc_pton(const char *, const size_t,
+size_t         rpc_pton(struct net *, const char *, const size_t,
                          struct sockaddr *, const size_t);
  char *         rpc_sockaddr2uaddr(const struct sockaddr *, gfp_t);
-size_t         rpc_uaddr2sockaddr(const char *, const size_t,
+size_t         rpc_uaddr2sockaddr(struct net *, const char *, const size_t,
                                    struct sockaddr *, const size_t);
  
  static inline unsigned short rpc_get_port(const struct sockaddr *sap)
diff --git a/include/linux/sunrpc/debug.h b/include/linux/sunrpc/debug.h
index c2786f20016f9184f0849c2ca9e94636efeaff9e..a76cc20d98ce21531a6fcbd70e861bbfc5055c12 100644
--- a/include/linux/sunrpc/debug.h
+++ b/include/linux/sunrpc/debug.h
@@ -31,9 +31,12 @@
  /*
   * Enable RPC debugging/profiling.
   */
-#ifdef CONFIG_SYSCTL
+#ifdef CONFIG_SUNRPC_DEBUG
  #define  RPC_DEBUG
  #endif
+#ifdef CONFIG_TRACEPOINTS
+#define RPC_TRACEPOINTS
+#endif
  /* #define  RPC_PROFILE */
  
  /*
@@ -47,15 +50,32 @@ extern unsigned int         nlm_debug;
  #endif
  
  #define dprintk(args...)       dfprintk(FACILITY, ## args)
+#define dprintk_rcu(args...)   dfprintk_rcu(FACILITY, ## args)
  
  #undef ifdebug
  #ifdef RPC_DEBUG                       
  # define ifdebug(fac)          if (unlikely(rpc_debug & RPCDBG_##fac))
-# define dfprintk(fac, args...)        do { ifdebug(fac) printk(args); } while(0)
+
+# define dfprintk(fac, args...)        \
+       do { \
+               ifdebug(fac) \
+                       printk(KERN_DEFAULT args); \
+       } while (0)
+
+# define dfprintk_rcu(fac, args...)    \
+       do { \
+               ifdebug(fac) { \
+                       rcu_read_lock(); \
+                       printk(KERN_DEFAULT args); \
+                       rcu_read_unlock(); \
+               } \
+       } while (0)
+
  # define RPC_IFDEBUG(x)                x
  #else
  # define ifdebug(fac)          if (0)
-# define dfprintk(fac, args...)        do ; while (0)
+# define dfprintk(fac, args...)        do {} while (0)
+# define dfprintk_rcu(fac, args...)    do {} while (0)
  # define RPC_IFDEBUG(x)
  #endif
  
diff --git a/include/linux/sunrpc/metrics.h b/include/linux/sunrpc/metrics.h
index b6edbc0ea83dddcdc450fa05a06cc6face7d69b5..1565bbe86d51e77c2323ca26a04bbad8a35b9f22 100644
--- a/include/linux/sunrpc/metrics.h
+++ b/include/linux/sunrpc/metrics.h
@@ -74,14 +74,16 @@ struct rpc_clnt;
  #ifdef CONFIG_PROC_FS
  
  struct rpc_iostats *   rpc_alloc_iostats(struct rpc_clnt *);
-void                   rpc_count_iostats(struct rpc_task *);
+void                   rpc_count_iostats(const struct rpc_task *,
+                                         struct rpc_iostats *);
  void                   rpc_print_iostats(struct seq_file *, struct rpc_clnt *);
  void                   rpc_free_iostats(struct rpc_iostats *);
  
  #else  /*  CONFIG_PROC_FS  */
  
  static inline struct rpc_iostats *rpc_alloc_iostats(struct rpc_clnt *clnt) { return NULL; }
-static inline void rpc_count_iostats(struct rpc_task *task) {}
+static inline void rpc_count_iostats(const struct rpc_task *task,
+                                    struct rpc_iostats *stats) {}
  static inline void rpc_print_iostats(struct seq_file *seq, struct rpc_clnt *clnt) {}
  static inline void rpc_free_iostats(struct rpc_iostats *stats) {}
  
diff --git a/include/linux/sunrpc/rpc_pipe_fs.h b/include/linux/sunrpc/rpc_pipe_fs.h
index 2bb03d77375a26fb30c00fd1846f232a73e809bf..a7b422b33eda65583d175ccd494b58d23c7716ec 100644
--- a/include/linux/sunrpc/rpc_pipe_fs.h
+++ b/include/linux/sunrpc/rpc_pipe_fs.h
@@ -21,21 +21,26 @@ struct rpc_pipe_ops {
         void (*destroy_msg)(struct rpc_pipe_msg *);
  };
  
-struct rpc_inode {
-       struct inode vfs_inode;
-       void *private;
+struct rpc_pipe {
         struct list_head pipe;
         struct list_head in_upcall;
         struct list_head in_downcall;
         int pipelen;
         int nreaders;
         int nwriters;
-       int nkern_readwriters;
-       wait_queue_head_t waitq;
  #define RPC_PIPE_WAIT_FOR_OPEN 1
         int flags;
         struct delayed_work queue_timeout;
         const struct rpc_pipe_ops *ops;
+       spinlock_t lock;
+       struct dentry *dentry;
+};
+
+struct rpc_inode {
+       struct inode vfs_inode;
+       void *private;
+       struct rpc_pipe *pipe;
+       wait_queue_head_t waitq;
  };
  
  static inline struct rpc_inode *
@@ -44,9 +49,28 @@ RPC_I(struct inode *inode)
         return container_of(inode, struct rpc_inode, vfs_inode);
  }
  
+enum {
+       SUNRPC_PIPEFS_NFS_PRIO,
+       SUNRPC_PIPEFS_RPC_PRIO,
+};
+
+extern int rpc_pipefs_notifier_register(struct notifier_block *);
+extern void rpc_pipefs_notifier_unregister(struct notifier_block *);
+
+enum {
+       RPC_PIPEFS_MOUNT,
+       RPC_PIPEFS_UMOUNT,
+};
+
+extern struct dentry *rpc_d_lookup_sb(const struct super_block *sb,
+                                     const unsigned char *dir_name);
+extern void rpc_pipefs_init_net(struct net *net);
+extern struct super_block *rpc_get_sb_net(const struct net *net);
+extern void rpc_put_sb_net(const struct net *net);
+
  extern ssize_t rpc_pipe_generic_upcall(struct file *, struct rpc_pipe_msg *,
                                        char __user *, size_t);
-extern int rpc_queue_upcall(struct inode *, struct rpc_pipe_msg *);
+extern int rpc_queue_upcall(struct rpc_pipe *, struct rpc_pipe_msg *);
  
  struct rpc_clnt;
  extern struct dentry *rpc_create_client_dir(struct dentry *, struct qstr *, struct rpc_clnt *);
@@ -59,11 +83,13 @@ extern struct dentry *rpc_create_cache_dir(struct dentry *,
                                            struct cache_detail *);
  extern void rpc_remove_cache_dir(struct dentry *);
  
-extern struct dentry *rpc_mkpipe(struct dentry *, const char *, void *,
-                                const struct rpc_pipe_ops *, int flags);
+extern int rpc_rmdir(struct dentry *dentry);
+
+struct rpc_pipe *rpc_mkpipe_data(const struct rpc_pipe_ops *ops, int flags);
+void rpc_destroy_pipe_data(struct rpc_pipe *pipe);
+extern struct dentry *rpc_mkpipe_dentry(struct dentry *, const char *, void *,
+                                       struct rpc_pipe *);
  extern int rpc_unlink(struct dentry *);
-extern struct vfsmount *rpc_get_mount(void);
-extern void rpc_put_mount(void);
  extern int register_rpc_pipefs(void);
  extern void unregister_rpc_pipefs(void);
  
diff --git a/include/linux/sunrpc/sched.h b/include/linux/sunrpc/sched.h
index e7756896f3ca292f4398304f7761b691232df131..dc0c3cc3ada3f8ced03b7fb00b2b1772bb722034 100644
--- a/include/linux/sunrpc/sched.h
+++ b/include/linux/sunrpc/sched.h
@@ -103,6 +103,7 @@ typedef void                        (*rpc_action)(struct rpc_task *);
  struct rpc_call_ops {
         void (*rpc_call_prepare)(struct rpc_task *, void *);
         void (*rpc_call_done)(struct rpc_task *, void *);
+       void (*rpc_count_stats)(struct rpc_task *, void *);
         void (*rpc_release)(void *);
  };
  
@@ -195,7 +196,7 @@ struct rpc_wait_queue {
         unsigned char           nr;                     /* # tasks remaining for cookie */
         unsigned short          qlen;                   /* total # tasks waiting in queue */
         struct rpc_timer        timer_list;
-#ifdef RPC_DEBUG
+#if defined(RPC_DEBUG) || defined(RPC_TRACEPOINTS)
         const char *            name;
  #endif
  };
@@ -235,6 +236,9 @@ void                rpc_wake_up_queued_task(struct rpc_wait_queue *,
                                         struct rpc_task *);
  void           rpc_wake_up(struct rpc_wait_queue *);
  struct rpc_task *rpc_wake_up_next(struct rpc_wait_queue *);
+struct rpc_task *rpc_wake_up_first(struct rpc_wait_queue *,
+                                       bool (*)(struct rpc_task *, void *),
+                                       void *);
  void           rpc_wake_up_status(struct rpc_wait_queue *, int);
  int            rpc_queue_empty(struct rpc_wait_queue *);
  void           rpc_delay(struct rpc_task *, unsigned long);
@@ -244,7 +248,8 @@ int         rpciod_up(void);
  void           rpciod_down(void);
  int            __rpc_wait_for_completion_task(struct rpc_task *task, int (*)(void *));
  #ifdef RPC_DEBUG
-void           rpc_show_tasks(void);
+struct net;
+void           rpc_show_tasks(struct net *);
  #endif
  int            rpc_init_mempool(void);
  void           rpc_destroy_mempool(void);
@@ -266,11 +271,22 @@ static inline int rpc_task_has_priority(struct rpc_task *task, unsigned char pri
         return (task->tk_priority + RPC_PRIORITY_LOW == prio);
  }
  
-#ifdef RPC_DEBUG
-static inline const char * rpc_qname(struct rpc_wait_queue *q)
+#if defined(RPC_DEBUG) || defined (RPC_TRACEPOINTS)
+static inline const char * rpc_qname(const struct rpc_wait_queue *q)
  {
         return ((q && q->name) ? q->name : "unknown");
  }
+
+static inline void rpc_assign_waitqueue_name(struct rpc_wait_queue *q,
+               const char *name)
+{
+       q->name = name;
+}
+#else
+static inline void rpc_assign_waitqueue_name(struct rpc_wait_queue *q,
+               const char *name)
+{
+}
  #endif
  
  #endif /* _LINUX_SUNRPC_SCHED_H_ */
diff --git a/include/linux/sunrpc/stats.h b/include/linux/sunrpc/stats.h
index 680471d1f28a4300700883489c3b8b78335c07da..edc64219f92b1153b5f0c2517d06fa46d1228ea8 100644
--- a/include/linux/sunrpc/stats.h
+++ b/include/linux/sunrpc/stats.h
@@ -12,7 +12,7 @@
  #include <linux/proc_fs.h>
  
  struct rpc_stat {
-       struct rpc_program *    program;
+       const struct rpc_program *program;
  
         unsigned int            netcnt,
                                 netudpcnt,
@@ -58,24 +58,24 @@ void                        rpc_modcount(struct inode *, int);
  #endif
  
  #ifdef CONFIG_PROC_FS
-struct proc_dir_entry *        rpc_proc_register(struct rpc_stat *);
-void                   rpc_proc_unregister(const char *);
-void                   rpc_proc_zero(struct rpc_program *);
-struct proc_dir_entry *        svc_proc_register(struct svc_stat *,
+struct proc_dir_entry *        rpc_proc_register(struct net *,struct rpc_stat *);
+void                   rpc_proc_unregister(struct net *,const char *);
+void                   rpc_proc_zero(const struct rpc_program *);
+struct proc_dir_entry *        svc_proc_register(struct net *, struct svc_stat *,
                                           const struct file_operations *);
-void                   svc_proc_unregister(const char *);
+void                   svc_proc_unregister(struct net *, const char *);
  
  void                   svc_seq_show(struct seq_file *,
                                      const struct svc_stat *);
  #else
  
-static inline struct proc_dir_entry *rpc_proc_register(struct rpc_stat *s) { return NULL; }
-static inline void rpc_proc_unregister(const char *p) {}
-static inline void rpc_proc_zero(struct rpc_program *p) {}
+static inline struct proc_dir_entry *rpc_proc_register(struct net *net, struct rpc_stat *s) { return NULL; }
+static inline void rpc_proc_unregister(struct net *net, const char *p) {}
+static inline void rpc_proc_zero(const struct rpc_program *p) {}
  
-static inline struct proc_dir_entry *svc_proc_register(struct svc_stat *s,
+static inline struct proc_dir_entry *svc_proc_register(struct net *net, struct svc_stat *s,
                                                        const struct file_operations *f) { return NULL; }
-static inline void svc_proc_unregister(const char *p) {}
+static inline void svc_proc_unregister(struct net *net, const char *p) {}
  
  static inline void svc_seq_show(struct seq_file *seq,
                                 const struct svc_stat *st) {}
diff --git a/include/linux/sunrpc/svc.h b/include/linux/sunrpc/svc.h
index 35b37b1e9299e37825fe79a281a29055aa4d273a..51b29ac45a8e7b26583df0217ab37a0d939ad6da 100644
--- a/include/linux/sunrpc/svc.h
+++ b/include/linux/sunrpc/svc.h
@@ -84,7 +84,8 @@ struct svc_serv {
         unsigned int            sv_nrpools;     /* number of thread pools */
         struct svc_pool *       sv_pools;       /* array of thread pools */
  
-       void                    (*sv_shutdown)(struct svc_serv *serv);
+       void                    (*sv_shutdown)(struct svc_serv *serv,
+                                              struct net *net);
                                                 /* Callback to use when last thread
                                                  * exits.
                                                  */
@@ -413,22 +414,24 @@ struct svc_procedure {
  /*
   * Function prototypes.
   */
-void svc_rpcb_cleanup(struct svc_serv *serv);
+int svc_rpcb_setup(struct svc_serv *serv, struct net *net);
+void svc_rpcb_cleanup(struct svc_serv *serv, struct net *net);
  struct svc_serv *svc_create(struct svc_program *, unsigned int,
-                           void (*shutdown)(struct svc_serv *));
+                           void (*shutdown)(struct svc_serv *, struct net *net));
  struct svc_rqst *svc_prepare_thread(struct svc_serv *serv,
                                         struct svc_pool *pool, int node);
  void              svc_exit_thread(struct svc_rqst *);
  struct svc_serv *  svc_create_pooled(struct svc_program *, unsigned int,
-                       void (*shutdown)(struct svc_serv *),
+                       void (*shutdown)(struct svc_serv *, struct net *net),
                         svc_thread_fn, struct module *);
  int               svc_set_num_threads(struct svc_serv *, struct svc_pool *, int);
  int               svc_pool_stats_open(struct svc_serv *serv, struct file *file);
  void              svc_destroy(struct svc_serv *);
+void              svc_shutdown_net(struct svc_serv *, struct net *);
  int               svc_process(struct svc_rqst *);
  int               bc_svc_process(struct svc_serv *, struct rpc_rqst *,
                         struct svc_rqst *);
-int               svc_register(const struct svc_serv *, const int,
+int               svc_register(const struct svc_serv *, struct net *, const int,
                                 const unsigned short, const unsigned short);
  
  void              svc_wake_up(struct svc_serv *);
diff --git a/include/linux/sunrpc/svc_xprt.h b/include/linux/sunrpc/svc_xprt.h
index dfa900948af79a6fbd277bbbb5b56c0c7842cec4..b3f64b12f1415f7e14d1f49cdeca4ccf7de20f0f 100644
--- a/include/linux/sunrpc/svc_xprt.h
+++ b/include/linux/sunrpc/svc_xprt.h
@@ -121,7 +121,8 @@ void        svc_close_xprt(struct svc_xprt *xprt);
  int    svc_port_is_privileged(struct sockaddr *sin);
  int    svc_print_xprts(char *buf, int maxlen);
  struct svc_xprt *svc_find_xprt(struct svc_serv *serv, const char *xcl_name,
-                       const sa_family_t af, const unsigned short port);
+                       struct net *net, const sa_family_t af,
+                       const unsigned short port);
  int    svc_xprt_names(struct svc_serv *serv, char *buf, const int buflen);
  
  static inline void svc_xprt_get(struct svc_xprt *xprt)
diff --git a/include/linux/sunrpc/svcauth.h b/include/linux/sunrpc/svcauth.h
index 25d333c1b5717f3ee9f659168d805865ab420bd5..548790e9113b317dbc8de0c46a691df3c0030269 100644
--- a/include/linux/sunrpc/svcauth.h
+++ b/include/linux/sunrpc/svcauth.h
@@ -135,6 +135,9 @@ extern void svcauth_unix_purge(void);
  extern void svcauth_unix_info_release(struct svc_xprt *xpt);
  extern int svcauth_unix_set_client(struct svc_rqst *rqstp);
  
+extern int unix_gid_cache_create(struct net *net);
+extern void unix_gid_cache_destroy(struct net *net);
+
  static inline unsigned long hash_str(char *name, int bits)
  {
         unsigned long hash = 0;
diff --git a/include/linux/sunrpc/svcauth_gss.h b/include/linux/sunrpc/svcauth_gss.h
index 83bbee3f089cd7bd4491b5dd5edadc9f5e4c5207..7c32daa025eb07b644d8185a27c8ea10d8b7c55f 100644
--- a/include/linux/sunrpc/svcauth_gss.h
+++ b/include/linux/sunrpc/svcauth_gss.h
@@ -18,6 +18,8 @@
  
  int gss_svc_init(void);
  void gss_svc_shutdown(void);
+int gss_svc_init_net(struct net *net);
+void gss_svc_shutdown_net(struct net *net);
  int svcauth_gss_register_pseudoflavor(u32 pseudoflavor, char * name);
  u32 svcauth_gss_flavor(struct auth_domain *dom);
  char *svc_gss_principal(struct svc_rqst *);
diff --git a/include/linux/sunrpc/svcsock.h b/include/linux/sunrpc/svcsock.h
index c84e9741cb2a25471838c2c31503b8d550c8bfbf..cb4ac69e1f3356aeb96cdeed3ca83e106b45ba25 100644
--- a/include/linux/sunrpc/svcsock.h
+++ b/include/linux/sunrpc/svcsock.h
@@ -34,7 +34,7 @@ struct svc_sock {
  /*
   * Function prototypes.
   */
-void           svc_close_all(struct svc_serv *);
+void           svc_close_net(struct svc_serv *, struct net *);
  int            svc_recv(struct svc_rqst *, long);
  int            svc_send(struct svc_rqst *);
  void           svc_drop(struct svc_rqst *);
diff --git a/include/linux/sunrpc/xprt.h b/include/linux/sunrpc/xprt.h
index 15518a152ac3db6e973a95685d50d200e6a1f475..77d278defa70667011b5a8b08ba70394b23d1d84 100644
--- a/include/linux/sunrpc/xprt.h
+++ b/include/linux/sunrpc/xprt.h
@@ -21,8 +21,8 @@
  
  #define RPC_MIN_SLOT_TABLE     (2U)
  #define RPC_DEF_SLOT_TABLE     (16U)
-#define RPC_MAX_SLOT_TABLE     (128U)
  #define RPC_MAX_SLOT_TABLE_LIMIT       (65536U)
+#define RPC_MAX_SLOT_TABLE     RPC_MAX_SLOT_TABLE_LIMIT
  
  /*
   * This describes a timeout strategy
@@ -219,13 +219,17 @@ struct rpc_xprt {
                                         connect_time,   /* jiffies waiting for connect */
                                         sends,          /* how many complete requests */
                                         recvs,          /* how many complete requests */
-                                       bad_xids;       /* lookup_rqst didn't find XID */
+                                       bad_xids,       /* lookup_rqst didn't find XID */
+                                       max_slots;      /* max rpc_slots used */
  
                 unsigned long long      req_u,          /* average requests on the wire */
-                                       bklog_u;        /* backlog queue utilization */
+                                       bklog_u,        /* backlog queue utilization */
+                                       sending_u,      /* send q utilization */
+                                       pending_u;      /* pend q utilization */
         } stat;
  
         struct net              *xprt_net;
+       const char              *servername;
         const char              *address_strings[RPC_DISPLAY_MAX];
  };
  
@@ -255,6 +259,7 @@ struct xprt_create {
         struct sockaddr *       srcaddr;        /* optional local address */
         struct sockaddr *       dstaddr;        /* remote peer address */
         size_t                  addrlen;
+       const char              *servername;
         struct svc_xprt         *bc_xprt;       /* NFSv4.1 backchannel */
  };
  
diff --git a/include/linux/sunrpc/xprtsock.h b/include/linux/sunrpc/xprtsock.h
index 3f14a02e9cc022dc5b4d276068110e62ff48b52a..1ad36cc25b2ed53756145457fa86dbbe6cacc009 100644
--- a/include/linux/sunrpc/xprtsock.h
+++ b/include/linux/sunrpc/xprtsock.h
@@ -12,18 +12,6 @@
  int            init_socket_xprt(void);
  void           cleanup_socket_xprt(void);
  
-/*
- * RPC slot table sizes for UDP, TCP transports
- */
-extern unsigned int xprt_udp_slot_table_entries;
-extern unsigned int xprt_tcp_slot_table_entries;
-
-/*
- * Parameters for choosing a free port
- */
-extern unsigned int xprt_min_resvport;
-extern unsigned int xprt_max_resvport;
-
  #define RPC_MIN_RESVPORT       (1U)
  #define RPC_MAX_RESVPORT       (65535U)
  #define RPC_DEF_MIN_RESVPORT   (665U)
diff --git a/include/trace/events/sunrpc.h b/include/trace/events/sunrpc.h
new file mode 100644
index 0000000..43be87d
--- /dev/null
+++ b/include/trace/events/sunrpc.h
@@ -0,0 +1,177 @@
+#undef TRACE_SYSTEM
+#define TRACE_SYSTEM sunrpc
+
+#if !defined(_TRACE_SUNRPC_H) || defined(TRACE_HEADER_MULTI_READ)
+#define _TRACE_SUNRPC_H
+
+#include <linux/sunrpc/sched.h>
+#include <linux/sunrpc/clnt.h>
+#include <linux/tracepoint.h>
+
+DECLARE_EVENT_CLASS(rpc_task_status,
+
+       TP_PROTO(struct rpc_task *task),
+
+       TP_ARGS(task),
+
+       TP_STRUCT__entry(
+               __field(const struct rpc_task *, task)
+               __field(const struct rpc_clnt *, clnt)
+               __field(int, status)
+       ),
+
+       TP_fast_assign(
+               __entry->task = task;
+               __entry->clnt = task->tk_client;
+               __entry->status = task->tk_status;
+       ),
+
+       TP_printk("task:%p@%p, status %d",__entry->task, __entry->clnt, __entry->status)
+);
+
+DEFINE_EVENT(rpc_task_status, rpc_call_status,
+       TP_PROTO(struct rpc_task *task),
+
+       TP_ARGS(task)
+);
+
+DEFINE_EVENT(rpc_task_status, rpc_bind_status,
+       TP_PROTO(struct rpc_task *task),
+
+       TP_ARGS(task)
+);
+
+TRACE_EVENT(rpc_connect_status,
+       TP_PROTO(struct rpc_task *task, int status),
+
+       TP_ARGS(task, status),
+
+       TP_STRUCT__entry(
+               __field(const struct rpc_task *, task)
+               __field(const struct rpc_clnt *, clnt)
+               __field(int, status)
+       ),
+
+       TP_fast_assign(
+               __entry->task = task;
+               __entry->clnt = task->tk_client;
+               __entry->status = status;
+       ),
+
+       TP_printk("task:%p@%p, status %d",__entry->task, __entry->clnt, __entry->status)
+);
+
+DECLARE_EVENT_CLASS(rpc_task_running,
+
+       TP_PROTO(const struct rpc_clnt *clnt, const struct rpc_task *task, const void *action),
+
+       TP_ARGS(clnt, task, action),
+
+       TP_STRUCT__entry(
+               __field(const struct rpc_clnt *, clnt)
+               __field(const struct rpc_task *, task)
+               __field(const void *, action)
+               __field(unsigned long, runstate)
+               __field(int, status)
+               __field(unsigned short, flags)
+               ),
+
+       TP_fast_assign(
+               __entry->clnt = clnt;
+               __entry->task = task;
+               __entry->action = action;
+               __entry->runstate = task->tk_runstate;
+               __entry->status = task->tk_status;
+               __entry->flags = task->tk_flags;
+               ),
+
+       TP_printk("task:%p@%p flags=%4.4x state=%4.4lx status=%d action=%pf",
+               __entry->task,
+               __entry->clnt,
+               __entry->flags,
+               __entry->runstate,
+               __entry->status,
+               __entry->action
+               )
+);
+
+DEFINE_EVENT(rpc_task_running, rpc_task_begin,
+
+       TP_PROTO(const struct rpc_clnt *clnt, const struct rpc_task *task, const void *action),
+
+       TP_ARGS(clnt, task, action)
+
+);
+
+DEFINE_EVENT(rpc_task_running, rpc_task_run_action,
+
+       TP_PROTO(const struct rpc_clnt *clnt, const struct rpc_task *task, const void *action),
+
+       TP_ARGS(clnt, task, action)
+
+);
+
+DEFINE_EVENT(rpc_task_running, rpc_task_complete,
+
+       TP_PROTO(const struct rpc_clnt *clnt, const struct rpc_task *task, const void *action),
+
+       TP_ARGS(clnt, task, action)
+
+);
+
+DECLARE_EVENT_CLASS(rpc_task_queued,
+
+       TP_PROTO(const struct rpc_clnt *clnt, const struct rpc_task *task, const struct rpc_wait_queue *q),
+
+       TP_ARGS(clnt, task, q),
+
+       TP_STRUCT__entry(
+               __field(const struct rpc_clnt *, clnt)
+               __field(const struct rpc_task *, task)
+               __field(unsigned long, timeout)
+               __field(unsigned long, runstate)
+               __field(int, status)
+               __field(unsigned short, flags)
+               __string(q_name, rpc_qname(q))
+               ),
+
+       TP_fast_assign(
+               __entry->clnt = clnt;
+               __entry->task = task;
+               __entry->timeout = task->tk_timeout;
+               __entry->runstate = task->tk_runstate;
+               __entry->status = task->tk_status;
+               __entry->flags = task->tk_flags;
+               __assign_str(q_name, rpc_qname(q));
+               ),
+
+       TP_printk("task:%p@%p flags=%4.4x state=%4.4lx status=%d timeout=%lu queue=%s",
+               __entry->task,
+               __entry->clnt,
+               __entry->flags,
+               __entry->runstate,
+               __entry->status,
+               __entry->timeout,
+               __get_str(q_name)
+               )
+);
+
+DEFINE_EVENT(rpc_task_queued, rpc_task_sleep,
+
+       TP_PROTO(const struct rpc_clnt *clnt, const struct rpc_task *task, const struct rpc_wait_queue *q),
+
+       TP_ARGS(clnt, task, q)
+
+);
+
+DEFINE_EVENT(rpc_task_queued, rpc_task_wakeup,
+
+       TP_PROTO(const struct rpc_clnt *clnt, const struct rpc_task *task, const struct rpc_wait_queue *q),
+
+       TP_ARGS(clnt, task, q)
+
+);
+
+#endif /* _TRACE_SUNRPC_H */
+
+#include <trace/define_trace.h>
diff --git a/net/sunrpc/Kconfig b/net/sunrpc/Kconfig
index ffd243d09188dee10d2dffa34d642f7c941b9013..9fe8857d8d596e5eb59146416396b834b195b2c1 100644
--- a/net/sunrpc/Kconfig
+++ b/net/sunrpc/Kconfig
@@ -39,3 +39,16 @@ config RPCSEC_GSS_KRB5
           Kerberos support should be installed.
  
           If unsure, say Y.
+
+config SUNRPC_DEBUG
+       bool "RPC: Enable dprintk debugging"
+       depends on SUNRPC && SYSCTL
+       help
+         This option enables a sysctl-based debugging interface
+         that is used by the 'rpcdebug' utility to turn on or off
+         logging of different aspects of the kernel RPC activity.
+
+         Disabling this option will make your kernel slightly smaller,
+         but makes troubleshooting NFS issues significantly harder.
+
+         If unsure, say Y.
diff --git a/net/sunrpc/addr.c b/net/sunrpc/addr.c
index ee77742e0ed6d3dcc89401808028b0228a666aaf..d11418f97f1fa58cb04b3badb1a03d21b91af7a5 100644
--- a/net/sunrpc/addr.c
+++ b/net/sunrpc/addr.c
@@ -156,8 +156,9 @@ static size_t rpc_pton4(const char *buf, const size_t buflen,
  }
  
  #if IS_ENABLED(CONFIG_IPV6)
-static int rpc_parse_scope_id(const char *buf, const size_t buflen,
-                             const char *delim, struct sockaddr_in6 *sin6)
+static int rpc_parse_scope_id(struct net *net, const char *buf,
+                             const size_t buflen, const char *delim,
+                             struct sockaddr_in6 *sin6)
  {
         char *p;
         size_t len;
@@ -177,7 +178,7 @@ static int rpc_parse_scope_id(const char *buf, const size_t buflen,
                 unsigned long scope_id = 0;
                 struct net_device *dev;
  
-               dev = dev_get_by_name(&init_net, p);
+               dev = dev_get_by_name(net, p);
                 if (dev != NULL) {
                         scope_id = dev->ifindex;
                         dev_put(dev);
@@ -197,7 +198,7 @@ static int rpc_parse_scope_id(const char *buf, const size_t buflen,
         return 0;
  }
  
-static size_t rpc_pton6(const char *buf, const size_t buflen,
+static size_t rpc_pton6(struct net *net, const char *buf, const size_t buflen,
                         struct sockaddr *sap, const size_t salen)
  {
         struct sockaddr_in6 *sin6 = (struct sockaddr_in6 *)sap;
@@ -213,14 +214,14 @@ static size_t rpc_pton6(const char *buf, const size_t buflen,
         if (in6_pton(buf, buflen, addr, IPV6_SCOPE_DELIMITER, &delim) == 0)
                 return 0;
  
-       if (!rpc_parse_scope_id(buf, buflen, delim, sin6))
+       if (!rpc_parse_scope_id(net, buf, buflen, delim, sin6))
                 return 0;
  
         sin6->sin6_family = AF_INET6;
         return sizeof(struct sockaddr_in6);
  }
  #else
-static size_t rpc_pton6(const char *buf, const size_t buflen,
+static size_t rpc_pton6(struct net *net, const char *buf, const size_t buflen,
                         struct sockaddr *sap, const size_t salen)
  {
         return 0;
@@ -229,6 +230,7 @@ static size_t rpc_pton6(const char *buf, const size_t buflen,
  
  /**
   * rpc_pton - Construct a sockaddr in @sap
+ * @net: applicable network namespace
   * @buf: C string containing presentation format IP address
   * @buflen: length of presentation address in bytes
   * @sap: buffer into which to plant socket address
@@ -241,14 +243,14 @@ static size_t rpc_pton6(const char *buf, const size_t buflen,
   * socket address, if successful.  Returns zero if an error
   * occurred.
   */
-size_t rpc_pton(const char *buf, const size_t buflen,
+size_t rpc_pton(struct net *net, const char *buf, const size_t buflen,
                 struct sockaddr *sap, const size_t salen)
  {
         unsigned int i;
  
         for (i = 0; i < buflen; i++)
                 if (buf[i] == ':')
-                       return rpc_pton6(buf, buflen, sap, salen);
+                       return rpc_pton6(net, buf, buflen, sap, salen);
         return rpc_pton4(buf, buflen, sap, salen);
  }
  EXPORT_SYMBOL_GPL(rpc_pton);
@@ -295,6 +297,7 @@ char *rpc_sockaddr2uaddr(const struct sockaddr *sap, gfp_t gfp_flags)
  
  /**
   * rpc_uaddr2sockaddr - convert a universal address to a socket address.
+ * @net: applicable network namespace
   * @uaddr: C string containing universal address to convert
   * @uaddr_len: length of universal address string
   * @sap: buffer into which to plant socket address
@@ -306,8 +309,9 @@ char *rpc_sockaddr2uaddr(const struct sockaddr *sap, gfp_t gfp_flags)
   * Returns the size of the socket address if successful; otherwise
   * zero is returned.
   */
-size_t rpc_uaddr2sockaddr(const char *uaddr, const size_t uaddr_len,
-                         struct sockaddr *sap, const size_t salen)
+size_t rpc_uaddr2sockaddr(struct net *net, const char *uaddr,
+                         const size_t uaddr_len, struct sockaddr *sap,
+                         const size_t salen)
  {
         char *c, buf[RPCBIND_MAXUADDRLEN + sizeof('\0')];
         unsigned long portlo, porthi;
@@ -339,7 +343,7 @@ size_t rpc_uaddr2sockaddr(const char *uaddr, const size_t uaddr_len,
         port = (unsigned short)((porthi << 8) | portlo);
  
         *c = '\0';
-       if (rpc_pton(buf, strlen(buf), sap, salen) == 0)
+       if (rpc_pton(net, buf, strlen(buf), sap, salen) == 0)
                 return 0;
  
         switch (sap->sa_family) {
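
Note on the addr.c hunks above: rpc_uaddr2sockaddr() splits a universal address of the form "h1.h2.h3.h4.p1.p2" into a presentation address and a port, combining the last two fields as (porthi << 8) | portlo before handing the remainder to rpc_pton(). The following is a minimal userspace sketch of that split, not the kernel code. It assumes IPv4 only, uses inet_pton() in place of rpc_pton(), has no network namespace argument, and the names parse_octet() and uaddr_to_sockaddr_in() are hypothetical.

```c
#include <stdlib.h>
#include <string.h>
#include <arpa/inet.h>
#include <netinet/in.h>

/* Hypothetical helper: parse a decimal field in the range 0..255. */
static int parse_octet(const char *s, unsigned long *out)
{
	char *end;

	/* Reject empty fields, signs and leading whitespace up front. */
	if (*s < '0' || *s > '9')
		return -1;
	*out = strtoul(s, &end, 10);
	if (*end != '\0' || *out > 255)
		return -1;
	return 0;
}

/*
 * Hypothetical userspace analogue of rpc_uaddr2sockaddr(), IPv4 only.
 * Returns 0 on success, -1 on malformed input.
 */
int uaddr_to_sockaddr_in(const char *uaddr, struct sockaddr_in *sin)
{
	char buf[64];
	char *c;
	unsigned long portlo, porthi;
	size_t len = strlen(uaddr);

	if (len >= sizeof(buf))
		return -1;
	memcpy(buf, uaddr, len + 1);

	/* The last field is the low byte of the port. */
	c = strrchr(buf, '.');
	if (c == NULL || parse_octet(c + 1, &portlo) != 0)
		return -1;
	*c = '\0';

	/* The field before it is the high byte of the port. */
	c = strrchr(buf, '.');
	if (c == NULL || parse_octet(c + 1, &porthi) != 0)
		return -1;
	*c = '\0';

	/* What remains in buf is the presentation-format address. */
	memset(sin, 0, sizeof(*sin));
	if (inet_pton(AF_INET, buf, &sin->sin_addr) != 1)
		return -1;
	sin->sin_family = AF_INET;
	sin->sin_port = htons((unsigned short)((porthi << 8) | portlo));
	return 0;
}
```

With this sketch, "192.0.2.1.8.1" yields address 192.0.2.1 and port 8 * 256 + 1 = 2049, while a string with too few fields fails because the remaining address no longer parses.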
diff --git a/net/sunrpc/auth_gss/auth_gss.c b/net/sunrpc/auth_gss/auth_gss.c
index affa631ac1abe3d42f75635a30e1f066dccbfb44..d3ad81f8da5b79551c36b17a7d53007406946699 100644
--- a/net/sunrpc/auth_gss/auth_gss.c
+++ b/net/sunrpc/auth_gss/auth_gss.c
@@ -81,7 +81,7 @@ struct gss_auth {
          * mechanism (for example, "krb5") and exists for
          * backwards-compatibility with older gssd's.
          */
-       struct dentry *dentry[2];
+       struct rpc_pipe *pipe[2];
  };
  
  /* pipe_version >= 0 if and only if someone has a pipe open. */
@@ -112,7 +112,7 @@ gss_put_ctx(struct gss_cl_ctx *ctx)
  /* gss_cred_set_ctx:
   * called by gss_upcall_callback and gss_create_upcall in order
   * to set the gss context. The actual exchange of an old context
- * and a new one is protected by the inode->i_lock.
+ * and a new one is protected by the pipe->lock.
   */
  static void
  gss_cred_set_ctx(struct rpc_cred *cred, struct gss_cl_ctx *ctx)
@@ -251,7 +251,7 @@ struct gss_upcall_msg {
         struct rpc_pipe_msg msg;
         struct list_head list;
         struct gss_auth *auth;
-       struct rpc_inode *inode;
+       struct rpc_pipe *pipe;
         struct rpc_wait_queue rpc_waitqueue;
         wait_queue_head_t waitqueue;
         struct gss_cl_ctx *ctx;
@@ -294,10 +294,10 @@ gss_release_msg(struct gss_upcall_msg *gss_msg)
  }
  
  static struct gss_upcall_msg *
-__gss_find_upcall(struct rpc_inode *rpci, uid_t uid)
+__gss_find_upcall(struct rpc_pipe *pipe, uid_t uid)
  {
         struct gss_upcall_msg *pos;
-       list_for_each_entry(pos, &rpci->in_downcall, list) {
+       list_for_each_entry(pos, &pipe->in_downcall, list) {
                 if (pos->uid != uid)
                         continue;
                 atomic_inc(&pos->count);
@@ -315,18 +315,17 @@ __gss_find_upcall(struct rpc_inode *rpci, uid_t uid)
  static inline struct gss_upcall_msg *
  gss_add_msg(struct gss_upcall_msg *gss_msg)
  {
-       struct rpc_inode *rpci = gss_msg->inode;
-       struct inode *inode = &rpci->vfs_inode;
+       struct rpc_pipe *pipe = gss_msg->pipe;
         struct gss_upcall_msg *old;
  
-       spin_lock(&inode->i_lock);
-       old = __gss_find_upcall(rpci, gss_msg->uid);
+       spin_lock(&pipe->lock);
+       old = __gss_find_upcall(pipe, gss_msg->uid);
         if (old == NULL) {
                 atomic_inc(&gss_msg->count);
-               list_add(&gss_msg->list, &rpci->in_downcall);
+               list_add(&gss_msg->list, &pipe->in_downcall);
         } else
                 gss_msg = old;
-       spin_unlock(&inode->i_lock);
+       spin_unlock(&pipe->lock);
         return gss_msg;
  }
  
@@ -342,14 +341,14 @@ __gss_unhash_msg(struct gss_upcall_msg *gss_msg)
  static void
  gss_unhash_msg(struct gss_upcall_msg *gss_msg)
  {
-       struct inode *inode = &gss_msg->inode->vfs_inode;
+       struct rpc_pipe *pipe = gss_msg->pipe;
  
         if (list_empty(&gss_msg->list))
                 return;
-       spin_lock(&inode->i_lock);
+       spin_lock(&pipe->lock);
         if (!list_empty(&gss_msg->list))
                 __gss_unhash_msg(gss_msg);
-       spin_unlock(&inode->i_lock);
+       spin_unlock(&pipe->lock);
  }
  
  static void
@@ -376,11 +375,11 @@ gss_upcall_callback(struct rpc_task *task)
         struct gss_cred *gss_cred = container_of(task->tk_rqstp->rq_cred,
                         struct gss_cred, gc_base);
         struct gss_upcall_msg *gss_msg = gss_cred->gc_upcall;
-       struct inode *inode = &gss_msg->inode->vfs_inode;
+       struct rpc_pipe *pipe = gss_msg->pipe;
  
-       spin_lock(&inode->i_lock);
+       spin_lock(&pipe->lock);
         gss_handle_downcall_result(gss_cred, gss_msg);
-       spin_unlock(&inode->i_lock);
+       spin_unlock(&pipe->lock);
         task->tk_status = gss_msg->msg.errno;
         gss_release_msg(gss_msg);
  }
@@ -450,7 +449,7 @@ gss_alloc_msg(struct gss_auth *gss_auth, struct rpc_clnt *clnt,
                 kfree(gss_msg);
                 return ERR_PTR(vers);
         }
-       gss_msg->inode = RPC_I(gss_auth->dentry[vers]->d_inode);
+       gss_msg->pipe = gss_auth->pipe[vers];
         INIT_LIST_HEAD(&gss_msg->list);
         rpc_init_wait_queue(&gss_msg->rpc_waitqueue, "RPCSEC_GSS upcall waitq");
         init_waitqueue_head(&gss_msg->waitqueue);
@@ -474,8 +473,7 @@ gss_setup_upcall(struct rpc_clnt *clnt, struct gss_auth *gss_auth, struct rpc_cr
                 return gss_new;
         gss_msg = gss_add_msg(gss_new);
         if (gss_msg == gss_new) {
-               struct inode *inode = &gss_new->inode->vfs_inode;
-               int res = rpc_queue_upcall(inode, &gss_new->msg);
+               int res = rpc_queue_upcall(gss_new->pipe, &gss_new->msg);
                 if (res) {
                         gss_unhash_msg(gss_new);
                         gss_msg = ERR_PTR(res);
@@ -506,7 +504,7 @@ gss_refresh_upcall(struct rpc_task *task)
         struct gss_cred *gss_cred = container_of(cred,
                         struct gss_cred, gc_base);
         struct gss_upcall_msg *gss_msg;
-       struct inode *inode;
+       struct rpc_pipe *pipe;
         int err = 0;
  
         dprintk("RPC: %5u gss_refresh_upcall for uid %u\n", task->tk_pid,
@@ -524,8 +522,8 @@ gss_refresh_upcall(struct rpc_task *task)
                 err = PTR_ERR(gss_msg);
                 goto out;
         }
-       inode = &gss_msg->inode->vfs_inode;
-       spin_lock(&inode->i_lock);
+       pipe = gss_msg->pipe;
+       spin_lock(&pipe->lock);
         if (gss_cred->gc_upcall != NULL)
                 rpc_sleep_on(&gss_cred->gc_upcall->rpc_waitqueue, task, NULL);
         else if (gss_msg->ctx == NULL && gss_msg->msg.errno >= 0) {
@@ -538,7 +536,7 @@ gss_refresh_upcall(struct rpc_task *task)
                 gss_handle_downcall_result(gss_cred, gss_msg);
                 err = gss_msg->msg.errno;
         }
-       spin_unlock(&inode->i_lock);
+       spin_unlock(&pipe->lock);
         gss_release_msg(gss_msg);
  out:
         dprintk("RPC: %5u gss_refresh_upcall for uid %u result %d\n",
@@ -549,7 +547,7 @@ out:
  static inline int
  gss_create_upcall(struct gss_auth *gss_auth, struct gss_cred *gss_cred)
  {
-       struct inode *inode;
+       struct rpc_pipe *pipe;
         struct rpc_cred *cred = &gss_cred->gc_base;
         struct gss_upcall_msg *gss_msg;
         DEFINE_WAIT(wait);
@@ -573,14 +571,14 @@ retry:
                 err = PTR_ERR(gss_msg);
                 goto out;
         }
-       inode = &gss_msg->inode->vfs_inode;
+       pipe = gss_msg->pipe;
         for (;;) {
                 prepare_to_wait(&gss_msg->waitqueue, &wait, TASK_KILLABLE);
-               spin_lock(&inode->i_lock);
+               spin_lock(&pipe->lock);
                 if (gss_msg->ctx != NULL || gss_msg->msg.errno < 0) {
                         break;
                 }
-               spin_unlock(&inode->i_lock);
+               spin_unlock(&pipe->lock);
                 if (fatal_signal_pending(current)) {
                         err = -ERESTARTSYS;
                         goto out_intr;
@@ -591,7 +589,7 @@ retry:
                 gss_cred_set_ctx(cred, gss_msg->ctx);
         else
                 err = gss_msg->msg.errno;
-       spin_unlock(&inode->i_lock);
+       spin_unlock(&pipe->lock);
  out_intr:
         finish_wait(&gss_msg->waitqueue, &wait);
         gss_release_msg(gss_msg);
@@ -609,7 +607,7 @@ gss_pipe_downcall(struct file *filp, const char __user *src, size_t mlen)
         const void *p, *end;
         void *buf;
         struct gss_upcall_msg *gss_msg;
-       struct inode *inode = filp->f_path.dentry->d_inode;
+       struct rpc_pipe *pipe = RPC_I(filp->f_dentry->d_inode)->pipe;
         struct gss_cl_ctx *ctx;
         uid_t uid;
         ssize_t err = -EFBIG;
@@ -639,14 +637,14 @@ gss_pipe_downcall(struct file *filp, const char __user *src, size_t mlen)
  
         err = -ENOENT;
         /* Find a matching upcall */
-       spin_lock(&inode->i_lock);
-       gss_msg = __gss_find_upcall(RPC_I(inode), uid);
+       spin_lock(&pipe->lock);
+       gss_msg = __gss_find_upcall(pipe, uid);
         if (gss_msg == NULL) {
-               spin_unlock(&inode->i_lock);
+               spin_unlock(&pipe->lock);
                 goto err_put_ctx;
         }
         list_del_init(&gss_msg->list);
-       spin_unlock(&inode->i_lock);
+       spin_unlock(&pipe->lock);
  
         p = gss_fill_context(p, end, ctx, gss_msg->auth->mech);
         if (IS_ERR(p)) {
@@ -674,9 +672,9 @@ gss_pipe_downcall(struct file *filp, const char __user *src, size_t mlen)
         err = mlen;
  
  err_release_msg:
-       spin_lock(&inode->i_lock);
+       spin_lock(&pipe->lock);
         __gss_unhash_msg(gss_msg);
-       spin_unlock(&inode->i_lock);
+       spin_unlock(&pipe->lock);
         gss_release_msg(gss_msg);
  err_put_ctx:
         gss_put_ctx(ctx);
@@ -722,23 +720,23 @@ static int gss_pipe_open_v1(struct inode *inode)
  static void
  gss_pipe_release(struct inode *inode)
  {
-       struct rpc_inode *rpci = RPC_I(inode);
+       struct rpc_pipe *pipe = RPC_I(inode)->pipe;
         struct gss_upcall_msg *gss_msg;
  
  restart:
-       spin_lock(&inode->i_lock);
-       list_for_each_entry(gss_msg, &rpci->in_downcall, list) {
+       spin_lock(&pipe->lock);
+       list_for_each_entry(gss_msg, &pipe->in_downcall, list) {
  
                 if (!list_empty(&gss_msg->msg.list))
                         continue;
                 gss_msg->msg.errno = -EPIPE;
                 atomic_inc(&gss_msg->count);
                 __gss_unhash_msg(gss_msg);
-               spin_unlock(&inode->i_lock);
+               spin_unlock(&pipe->lock);
                 gss_release_msg(gss_msg);
                 goto restart;
         }
-       spin_unlock(&inode->i_lock);
+       spin_unlock(&pipe->lock);
  
         put_pipe_version();
  }
@@ -759,6 +757,75 @@ gss_pipe_destroy_msg(struct rpc_pipe_msg *msg)
         }
  }
  
+static void gss_pipes_dentries_destroy(struct rpc_auth *auth)
+{
+       struct gss_auth *gss_auth;
+
+       gss_auth = container_of(auth, struct gss_auth, rpc_auth);
+       if (gss_auth->pipe[0]->dentry)
+               rpc_unlink(gss_auth->pipe[0]->dentry);
+       if (gss_auth->pipe[1]->dentry)
+               rpc_unlink(gss_auth->pipe[1]->dentry);
+}
+
+static int gss_pipes_dentries_create(struct rpc_auth *auth)
+{
+       int err;
+       struct gss_auth *gss_auth;
+       struct rpc_clnt *clnt;
+
+       gss_auth = container_of(auth, struct gss_auth, rpc_auth);
+       clnt = gss_auth->client;
+
+       gss_auth->pipe[1]->dentry = rpc_mkpipe_dentry(clnt->cl_dentry,
+                                                     "gssd",
+                                                     clnt, gss_auth->pipe[1]);
+       if (IS_ERR(gss_auth->pipe[1]->dentry))
+               return PTR_ERR(gss_auth->pipe[1]->dentry);
+       gss_auth->pipe[0]->dentry = rpc_mkpipe_dentry(clnt->cl_dentry,
+                                                     gss_auth->mech->gm_name,
+                                                     clnt, gss_auth->pipe[0]);
+       if (IS_ERR(gss_auth->pipe[0]->dentry)) {
+               err = PTR_ERR(gss_auth->pipe[0]->dentry);
+               goto err_unlink_pipe_1;
+       }
+       return 0;
+
+err_unlink_pipe_1:
+       rpc_unlink(gss_auth->pipe[1]->dentry);
+       return err;
+}
+
+static void gss_pipes_dentries_destroy_net(struct rpc_clnt *clnt,
+                                          struct rpc_auth *auth)
+{
+       struct net *net = rpc_net_ns(clnt);
+       struct super_block *sb;
+
+       sb = rpc_get_sb_net(net);
+       if (sb) {
+               if (clnt->cl_dentry)
+                       gss_pipes_dentries_destroy(auth);
+               rpc_put_sb_net(net);
+       }
+}
+
+static int gss_pipes_dentries_create_net(struct rpc_clnt *clnt,
+                                        struct rpc_auth *auth)
+{
+       struct net *net = rpc_net_ns(clnt);
+       struct super_block *sb;
+       int err = 0;
+
+       sb = rpc_get_sb_net(net);
+       if (sb) {
+               if (clnt->cl_dentry)
+                       err = gss_pipes_dentries_create(auth);
+               rpc_put_sb_net(net);
+       }
+       return err;
+}
+
  /*
   * NOTE: we have the opportunity to use different
   * parameters based on the input flavor (which must be a pseudoflavor)
@@ -801,32 +868,33 @@ gss_create(struct rpc_clnt *clnt, rpc_authflavor_t flavor)
          * that we supported only the old pipe.  So we instead create
          * the new pipe first.
          */
-       gss_auth->dentry[1] = rpc_mkpipe(clnt->cl_path.dentry,
-                                        "gssd",
-                                        clnt, &gss_upcall_ops_v1,
-                                        RPC_PIPE_WAIT_FOR_OPEN);
-       if (IS_ERR(gss_auth->dentry[1])) {
-               err = PTR_ERR(gss_auth->dentry[1]);
+       gss_auth->pipe[1] = rpc_mkpipe_data(&gss_upcall_ops_v1,
+                                           RPC_PIPE_WAIT_FOR_OPEN);
+       if (IS_ERR(gss_auth->pipe[1])) {
+               err = PTR_ERR(gss_auth->pipe[1]);
                 goto err_put_mech;
         }
  
-       gss_auth->dentry[0] = rpc_mkpipe(clnt->cl_path.dentry,
-                                        gss_auth->mech->gm_name,
-                                        clnt, &gss_upcall_ops_v0,
-                                        RPC_PIPE_WAIT_FOR_OPEN);
-       if (IS_ERR(gss_auth->dentry[0])) {
-               err = PTR_ERR(gss_auth->dentry[0]);
-               goto err_unlink_pipe_1;
+       gss_auth->pipe[0] = rpc_mkpipe_data(&gss_upcall_ops_v0,
+                                           RPC_PIPE_WAIT_FOR_OPEN);
+       if (IS_ERR(gss_auth->pipe[0])) {
+               err = PTR_ERR(gss_auth->pipe[0]);
+               goto err_destroy_pipe_1;
         }
+       err = gss_pipes_dentries_create_net(clnt, auth);
+       if (err)
+               goto err_destroy_pipe_0;
         err = rpcauth_init_credcache(auth);
         if (err)
-               goto err_unlink_pipe_0;
+               goto err_unlink_pipes;
  
         return auth;
-err_unlink_pipe_0:
-       rpc_unlink(gss_auth->dentry[0]);
-err_unlink_pipe_1:
-       rpc_unlink(gss_auth->dentry[1]);
+err_unlink_pipes:
+       gss_pipes_dentries_destroy_net(clnt, auth);
+err_destroy_pipe_0:
+       rpc_destroy_pipe_data(gss_auth->pipe[0]);
+err_destroy_pipe_1:
+       rpc_destroy_pipe_data(gss_auth->pipe[1]);
  err_put_mech:
         gss_mech_put(gss_auth->mech);
  err_free:
@@ -839,8 +907,9 @@ out_dec:
  static void
  gss_free(struct gss_auth *gss_auth)
  {
-       rpc_unlink(gss_auth->dentry[1]);
-       rpc_unlink(gss_auth->dentry[0]);
+       gss_pipes_dentries_destroy_net(gss_auth->client, &gss_auth->rpc_auth);
+       rpc_destroy_pipe_data(gss_auth->pipe[0]);
+       rpc_destroy_pipe_data(gss_auth->pipe[1]);
         gss_mech_put(gss_auth->mech);
  
         kfree(gss_auth);
@@ -1547,7 +1616,9 @@ static const struct rpc_authops authgss_ops = {
         .create         = gss_create,
         .destroy        = gss_destroy,
         .lookup_cred    = gss_lookup_cred,
-       .crcreate       = gss_create_cred
+       .crcreate       = gss_create_cred,
+       .pipes_create   = gss_pipes_dentries_create,
+       .pipes_destroy  = gss_pipes_dentries_destroy,
  };
  
  static const struct rpc_credops gss_credops = {
@@ -1591,6 +1662,21 @@ static const struct rpc_pipe_ops gss_upcall_ops_v1 = {
         .release_pipe   = gss_pipe_release,
  };
  
+static __net_init int rpcsec_gss_init_net(struct net *net)
+{
+       return gss_svc_init_net(net);
+}
+
+static __net_exit void rpcsec_gss_exit_net(struct net *net)
+{
+       gss_svc_shutdown_net(net);
+}
+
+static struct pernet_operations rpcsec_gss_net_ops = {
+       .init = rpcsec_gss_init_net,
+       .exit = rpcsec_gss_exit_net,
+};
+
  /*
   * Initialize RPCSEC_GSS module
   */
@@ -1604,8 +1690,13 @@ static int __init init_rpcsec_gss(void)
         err = gss_svc_init();
         if (err)
                 goto out_unregister;
+       err = register_pernet_subsys(&rpcsec_gss_net_ops);
+       if (err)
+               goto out_svc_exit;
         rpc_init_wait_queue(&pipe_version_rpc_waitqueue, "gss pipe version");
         return 0;
+out_svc_exit:
+       gss_svc_shutdown();
  out_unregister:
         rpcauth_unregister(&authgss_ops);
  out:
@@ -1614,6 +1705,7 @@ out:
  
  static void __exit exit_rpcsec_gss(void)
  {
+       unregister_pernet_subsys(&rpcsec_gss_net_ops);
         gss_svc_shutdown();
         rpcauth_unregister(&authgss_ops);
         rcu_barrier(); /* Wait for completion of call_rcu()'s */
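
Note on the gss_create() hunk above: the error path now has one label per acquired resource (err_unlink_pipes, err_destroy_pipe_0, err_destroy_pipe_1), so a failure at any step releases only what was already set up, in reverse order. The following is a minimal userspace sketch of that unwinding pattern. The acquire(), release() and setup() helpers and the trace buffer are hypothetical stand-ins for illustration. None of them are kernel APIs.

```c
#include <string.h>

/* Records what was acquired ("+name ") and released ("-name "). */
static char trace[160];

static void note(const char *op, const char *name)
{
	strcat(trace, op);
	strcat(trace, name);
	strcat(trace, " ");
}

/* Hypothetical stand-in for a setup step that may fail. */
static int acquire(const char *name, int fail)
{
	if (fail)
		return -1;
	note("+", name);
	return 0;
}

/* Hypothetical stand-in for the matching teardown step. */
static void release(const char *name)
{
	note("-", name);
}

/*
 * Runs four setup steps in the same order as the hunk; step number
 * fail_at (1..4) fails, and 0 means every step succeeds.
 * Returns the recorded trace.
 */
const char *setup(int fail_at)
{
	trace[0] = '\0';

	if (acquire("pipe1", fail_at == 1))
		goto err_out;
	if (acquire("pipe0", fail_at == 2))
		goto err_destroy_pipe_1;
	if (acquire("dentries", fail_at == 3))
		goto err_destroy_pipe_0;
	if (acquire("credcache", fail_at == 4))
		goto err_unlink_pipes;
	return trace;

	/* Labels fall through, so teardown runs in reverse order. */
err_unlink_pipes:
	release("dentries");
err_destroy_pipe_0:
	release("pipe0");
err_destroy_pipe_1:
	release("pipe1");
err_out:
	return trace;
}
```

Calling setup(4) models a failure in the last step and records the first three resources being released in reverse order. Calling setup(2) releases only the first pipe.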
diff --git a/net/sunrpc/auth_gss/gss_krb5_crypto.c b/net/sunrpc/auth_gss/gss_krb5_crypto.c
index 9576f35ab7014f2c506dfdd306d4201e3535b7a7..0f43e894bc0a47e913ca5999afc69d392cc6e6ad 100644
--- a/net/sunrpc/auth_gss/gss_krb5_crypto.c
+++ b/net/sunrpc/auth_gss/gss_krb5_crypto.c
@@ -600,11 +600,14 @@ gss_krb5_cts_crypt(struct crypto_blkcipher *cipher, struct xdr_buf *buf,
         u32 ret;
         struct scatterlist sg[1];
         struct blkcipher_desc desc = { .tfm = cipher, .info = iv };
-       u8 data[crypto_blkcipher_blocksize(cipher) * 2];
+       u8 data[GSS_KRB5_MAX_BLOCKSIZE * 2];
         struct page **save_pages;
         u32 len = buf->len - offset;
  
-       BUG_ON(len > crypto_blkcipher_blocksize(cipher) * 2);
+       if (len > ARRAY_SIZE(data)) {
+               WARN_ON(0);
+               return -ENOMEM;
+       }
  
         /*
          * For encryption, we want to read from the cleartext
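
Note on the gss_krb5_cts_crypt() hunk above: the variable-length stack array sized by the cipher's block size is replaced by a fixed-size array, and the BUG_ON() is replaced by a bounds check that returns an error. The following is a minimal userspace sketch of that pattern. The value 16 for MAX_BLOCKSIZE is an assumption standing in for GSS_KRB5_MAX_BLOCKSIZE, and copy_tail_blocks() is a hypothetical function that only copies data where the kernel code would encrypt or decrypt it.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Assumed value; stands in for GSS_KRB5_MAX_BLOCKSIZE. */
#define MAX_BLOCKSIZE 16
#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/*
 * Hypothetical example: stage at most two cipher blocks in a
 * fixed-size stack buffer. Oversized input is rejected with an
 * error code instead of crashing the caller.
 */
int copy_tail_blocks(const unsigned char *src, size_t len,
		     unsigned char *out, size_t outlen)
{
	unsigned char data[MAX_BLOCKSIZE * 2];

	/* Check against the array itself so the limit cannot drift. */
	if (len > ARRAY_SIZE(data) || len > outlen)
		return -ENOMEM;

	memcpy(data, src, len);
	/* A real caller would transform the staged bytes here. */
	memcpy(out, data, len);
	return 0;
}
```

Sizing the buffer with a compile-time constant keeps stack usage predictable, and deriving the limit from ARRAY_SIZE(data) means the check follows the declaration if the constant changes.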
diff --git a/net/sunrpc/auth_gss/gss_krb5_mech.c b/net/sunrpc/auth_gss/gss_krb5_mech.c
index 8c67890de427e35a01e2777db805d0366bea0c8c..8eff8c32d1b9b403c2365326c16e44df7c0923e6 100644
--- a/net/sunrpc/auth_gss/gss_krb5_mech.c
+++ b/net/sunrpc/auth_gss/gss_krb5_mech.c
@@ -344,7 +344,7 @@ out_err:
         return PTR_ERR(p);
  }
  
-struct crypto_blkcipher *
+static struct crypto_blkcipher *
  context_v2_alloc_cipher(struct krb5_ctx *ctx, const char *cname, u8 *key)
  {
         struct crypto_blkcipher *cp;
diff --git a/net/sunrpc/auth_gss/gss_krb5_seal.c b/net/sunrpc/auth_gss/gss_krb5_seal.c
index d7941eab77969cacc26e4b17229e87c34d6894f3..62ae3273186cdd94545d26742ae7a2ece246a685 100644
--- a/net/sunrpc/auth_gss/gss_krb5_seal.c
+++ b/net/sunrpc/auth_gss/gss_krb5_seal.c
@@ -159,7 +159,7 @@ gss_get_mic_v1(struct krb5_ctx *ctx, struct xdr_buf *text,
         return (ctx->endtime < now) ? GSS_S_CONTEXT_EXPIRED : GSS_S_COMPLETE;
  }
  
-u32
+static u32
  gss_get_mic_v2(struct krb5_ctx *ctx, struct xdr_buf *text,
                 struct xdr_netobj *token)
  {
diff --git a/net/sunrpc/auth_gss/svcauth_gss.c b/net/sunrpc/auth_gss/svcauth_gss.c
index 8d0f7d3c71c80864356c6106c36886609c8c3713..1600cfb1618cd2abb4ae14be87f96554100a862a 100644
--- a/net/sunrpc/auth_gss/svcauth_gss.c
+++ b/net/sunrpc/auth_gss/svcauth_gss.c
@@ -48,6 +48,8 @@
  #include <linux/sunrpc/svcauth_gss.h>
  #include <linux/sunrpc/cache.h>
  
+#include "../netns.h"
+
  #ifdef RPC_DEBUG
  # define RPCDBG_FACILITY       RPCDBG_AUTH
  #endif
@@ -75,10 +77,8 @@ struct rsi {
         int                     major_status, minor_status;
  };
  
-static struct cache_head *rsi_table[RSI_HASHMAX];
-static struct cache_detail rsi_cache;
-static struct rsi *rsi_update(struct rsi *new, struct rsi *old);
-static struct rsi *rsi_lookup(struct rsi *item);
+static struct rsi *rsi_update(struct cache_detail *cd, struct rsi *new, struct rsi *old);
+static struct rsi *rsi_lookup(struct cache_detail *cd, struct rsi *item);
  
  static void rsi_free(struct rsi *rsii)
  {
@@ -216,7 +216,7 @@ static int rsi_parse(struct cache_detail *cd,
         if (dup_to_netobj(&rsii.in_token, buf, len))
                 goto out;
  
-       rsip = rsi_lookup(&rsii);
+       rsip = rsi_lookup(cd, &rsii);
         if (!rsip)
                 goto out;
  
@@ -258,21 +258,20 @@ static int rsi_parse(struct cache_detail *cd,
         if (dup_to_netobj(&rsii.out_token, buf, len))
                 goto out;
         rsii.h.expiry_time = expiry;
-       rsip = rsi_update(&rsii, rsip);
+       rsip = rsi_update(cd, &rsii, rsip);
         status = 0;
  out:
         rsi_free(&rsii);
         if (rsip)
-               cache_put(&rsip->h, &rsi_cache);
+               cache_put(&rsip->h, cd);
         else
                 status = -ENOMEM;
         return status;
  }
  
-static struct cache_detail rsi_cache = {
+static struct cache_detail rsi_cache_template = {
         .owner          = THIS_MODULE,
         .hash_size      = RSI_HASHMAX,
-       .hash_table     = rsi_table,
         .name           = "auth.rpcsec.init",
         .cache_put      = rsi_put,
         .cache_upcall   = rsi_upcall,
@@ -283,24 +282,24 @@ static struct cache_detail rsi_cache = {
         .alloc          = rsi_alloc,
  };
  
-static struct rsi *rsi_lookup(struct rsi *item)
+static struct rsi *rsi_lookup(struct cache_detail *cd, struct rsi *item)
  {
         struct cache_head *ch;
         int hash = rsi_hash(item);
  
-       ch = sunrpc_cache_lookup(&rsi_cache, &item->h, hash);
+       ch = sunrpc_cache_lookup(cd, &item->h, hash);
         if (ch)
                 return container_of(ch, struct rsi, h);
         else
                 return NULL;
  }
  
-static struct rsi *rsi_update(struct rsi *new, struct rsi *old)
+static struct rsi *rsi_update(struct cache_detail *cd, struct rsi *new, struct rsi *old)
  {
         struct cache_head *ch;
         int hash = rsi_hash(new);
  
-       ch = sunrpc_cache_update(&rsi_cache, &new->h,
+       ch = sunrpc_cache_update(cd, &new->h,
                                  &old->h, hash);
         if (ch)
                 return container_of(ch, struct rsi, h);
@@ -339,10 +338,8 @@ struct rsc {
         char                    *client_name;
  };
  
-static struct cache_head *rsc_table[RSC_HASHMAX];
-static struct cache_detail rsc_cache;
-static struct rsc *rsc_update(struct rsc *new, struct rsc *old);
-static struct rsc *rsc_lookup(struct rsc *item);
+static struct rsc *rsc_update(struct cache_detail *cd, struct rsc *new, struct rsc *old);
+static struct rsc *rsc_lookup(struct cache_detail *cd, struct rsc *item);
  
  static void rsc_free(struct rsc *rsci)
  {
@@ -444,7 +441,7 @@ static int rsc_parse(struct cache_detail *cd,
         if (expiry == 0)
                 goto out;
  
-       rscp = rsc_lookup(&rsci);
+       rscp = rsc_lookup(cd, &rsci);
         if (!rscp)
                 goto out;
  
@@ -506,22 +503,21 @@ static int rsc_parse(struct cache_detail *cd,
  
         }
         rsci.h.expiry_time = expiry;
-       rscp = rsc_update(&rsci, rscp);
+       rscp = rsc_update(cd, &rsci, rscp);
         status = 0;
  out:
         gss_mech_put(gm);
         rsc_free(&rsci);
         if (rscp)
-               cache_put(&rscp->h, &rsc_cache);
+               cache_put(&rscp->h, cd);
         else
                 status = -ENOMEM;
         return status;
  }
  
-static struct cache_detail rsc_cache = {
+static struct cache_detail rsc_cache_template = {
         .owner          = THIS_MODULE,
         .hash_size      = RSC_HASHMAX,
-       .hash_table     = rsc_table,
         .name           = "auth.rpcsec.context",
         .cache_put      = rsc_put,
         .cache_parse    = rsc_parse,
@@ -531,24 +527,24 @@ static struct cache_detail rsc_cache = {
         .alloc          = rsc_alloc,
  };
  
-static struct rsc *rsc_lookup(struct rsc *item)
+static struct rsc *rsc_lookup(struct cache_detail *cd, struct rsc *item)
  {
         struct cache_head *ch;
         int hash = rsc_hash(item);
  
-       ch = sunrpc_cache_lookup(&rsc_cache, &item->h, hash);
+       ch = sunrpc_cache_lookup(cd, &item->h, hash);
         if (ch)
                 return container_of(ch, struct rsc, h);
         else
                 return NULL;
  }
  
-static struct rsc *rsc_update(struct rsc *new, struct rsc *old)
+static struct rsc *rsc_update(struct cache_detail *cd, struct rsc *new, struct rsc *old)
  {
         struct cache_head *ch;
         int hash = rsc_hash(new);
  
-       ch = sunrpc_cache_update(&rsc_cache, &new->h,
+       ch = sunrpc_cache_update(cd, &new->h,
                                  &old->h, hash);
         if (ch)
                 return container_of(ch, struct rsc, h);
@@ -558,7 +554,7 @@ static struct rsc *rsc_update(struct rsc *new, struct rsc *old)
  
  
  static struct rsc *
-gss_svc_searchbyctx(struct xdr_netobj *handle)
+gss_svc_searchbyctx(struct cache_detail *cd, struct xdr_netobj *handle)
  {
         struct rsc rsci;
         struct rsc *found;
@@ -566,11 +562,11 @@ gss_svc_searchbyctx(struct xdr_netobj *handle)
         memset(&rsci, 0, sizeof(rsci));
         if (dup_to_netobj(&rsci.handle, handle->data, handle->len))
                 return NULL;
-       found = rsc_lookup(&rsci);
+       found = rsc_lookup(cd, &rsci);
         rsc_free(&rsci);
         if (!found)
                 return NULL;
-       if (cache_check(&rsc_cache, &found->h, NULL))
+       if (cache_check(cd, &found->h, NULL))
                 return NULL;
         return found;
  }
@@ -968,20 +964,20 @@ svcauth_gss_set_client(struct svc_rqst *rqstp)
  }
  
  static inline int
-gss_write_init_verf(struct svc_rqst *rqstp, struct rsi *rsip)
+gss_write_init_verf(struct cache_detail *cd, struct svc_rqst *rqstp, struct rsi *rsip)
  {
         struct rsc *rsci;
         int        rc;
  
         if (rsip->major_status != GSS_S_COMPLETE)
                 return gss_write_null_verf(rqstp);
-       rsci = gss_svc_searchbyctx(&rsip->out_handle);
+       rsci = gss_svc_searchbyctx(cd, &rsip->out_handle);
         if (rsci == NULL) {
                 rsip->major_status = GSS_S_NO_CONTEXT;
                 return gss_write_null_verf(rqstp);
         }
         rc = gss_write_verf(rqstp, rsci->mechctx, GSS_SEQ_WIN);
-       cache_put(&rsci->h, &rsc_cache);
+       cache_put(&rsci->h, cd);
         return rc;
  }
  
@@ -1000,6 +996,7 @@ static int svcauth_gss_handle_init(struct svc_rqst *rqstp,
         struct xdr_netobj tmpobj;
         struct rsi *rsip, rsikey;
         int ret;
+       struct sunrpc_net *sn = net_generic(rqstp->rq_xprt->xpt_net, sunrpc_net_id);
  
         /* Read the verifier; should be NULL: */
         *authp = rpc_autherr_badverf;
@@ -1028,17 +1025,17 @@ static int svcauth_gss_handle_init(struct svc_rqst *rqstp,
         }
  
         /* Perform upcall, or find upcall result: */
-       rsip = rsi_lookup(&rsikey);
+       rsip = rsi_lookup(sn->rsi_cache, &rsikey);
         rsi_free(&rsikey);
         if (!rsip)
                 return SVC_CLOSE;
-       if (cache_check(&rsi_cache, &rsip->h, &rqstp->rq_chandle) < 0)
+       if (cache_check(sn->rsi_cache, &rsip->h, &rqstp->rq_chandle) < 0)
                 /* No upcall result: */
                 return SVC_CLOSE;
  
         ret = SVC_CLOSE;
         /* Got an answer to the upcall; use it: */
-       if (gss_write_init_verf(rqstp, rsip))
+       if (gss_write_init_verf(sn->rsc_cache, rqstp, rsip))
                 goto out;
         if (resv->iov_len + 4 > PAGE_SIZE)
                 goto out;
@@ -1055,7 +1052,7 @@ static int svcauth_gss_handle_init(struct svc_rqst *rqstp,
  
         ret = SVC_COMPLETE;
  out:
-       cache_put(&rsip->h, &rsi_cache);
+       cache_put(&rsip->h, sn->rsi_cache);
         return ret;
  }
  
@@ -1079,6 +1076,7 @@ svcauth_gss_accept(struct svc_rqst *rqstp, __be32 *authp)
         __be32          *rpcstart;
         __be32          *reject_stat = resv->iov_base + resv->iov_len;
         int             ret;
+       struct sunrpc_net *sn = net_generic(rqstp->rq_xprt->xpt_net, sunrpc_net_id);
  
         dprintk("RPC:       svcauth_gss: argv->iov_len = %zd\n",
                         argv->iov_len);
@@ -1129,7 +1127,7 @@ svcauth_gss_accept(struct svc_rqst *rqstp, __be32 *authp)
         case RPC_GSS_PROC_DESTROY:
                 /* Look up the context, and check the verifier: */
                 *authp = rpcsec_gsserr_credproblem;
-               rsci = gss_svc_searchbyctx(&gc->gc_ctx);
+               rsci = gss_svc_searchbyctx(sn->rsc_cache, &gc->gc_ctx);
                 if (!rsci)
                         goto auth_err;
                 switch (gss_verify_header(rqstp, rsci, rpcstart, gc, authp)) {
@@ -1209,7 +1207,7 @@ drop:
         ret = SVC_DROP;
  out:
         if (rsci)
-               cache_put(&rsci->h, &rsc_cache);
+               cache_put(&rsci->h, sn->rsc_cache);
         return ret;
  }
  
@@ -1362,6 +1360,7 @@ svcauth_gss_release(struct svc_rqst *rqstp)
         struct rpc_gss_wire_cred *gc = &gsd->clcred;
         struct xdr_buf *resbuf = &rqstp->rq_res;
         int stat = -EINVAL;
+       struct sunrpc_net *sn = net_generic(rqstp->rq_xprt->xpt_net, sunrpc_net_id);
  
         if (gc->gc_proc != RPC_GSS_PROC_DATA)
                 goto out;
@@ -1404,7 +1403,7 @@ out_err:
                 put_group_info(rqstp->rq_cred.cr_group_info);
         rqstp->rq_cred.cr_group_info = NULL;
         if (gsd->rsci)
-               cache_put(&gsd->rsci->h, &rsc_cache);
+               cache_put(&gsd->rsci->h, sn->rsc_cache);
         gsd->rsci = NULL;
  
         return stat;
@@ -1429,30 +1428,96 @@ static struct auth_ops svcauthops_gss = {
         .set_client     = svcauth_gss_set_client,
  };
  
+static int rsi_cache_create_net(struct net *net)
+{
+       struct sunrpc_net *sn = net_generic(net, sunrpc_net_id);
+       struct cache_detail *cd;
+       int err;
+
+       cd = cache_create_net(&rsi_cache_template, net);
+       if (IS_ERR(cd))
+               return PTR_ERR(cd);
+       err = cache_register_net(cd, net);
+       if (err) {
+               cache_destroy_net(cd, net);
+               return err;
+       }
+       sn->rsi_cache = cd;
+       return 0;
+}
+
+static void rsi_cache_destroy_net(struct net *net)
+{
+       struct sunrpc_net *sn = net_generic(net, sunrpc_net_id);
+       struct cache_detail *cd = sn->rsi_cache;
+
+       sn->rsi_cache = NULL;
+       cache_purge(cd);
+       cache_unregister_net(cd, net);
+       cache_destroy_net(cd, net);
+}
+
+static int rsc_cache_create_net(struct net *net)
+{
+       struct sunrpc_net *sn = net_generic(net, sunrpc_net_id);
+       struct cache_detail *cd;
+       int err;
+
+       cd = cache_create_net(&rsc_cache_template, net);
+       if (IS_ERR(cd))
+               return PTR_ERR(cd);
+       err = cache_register_net(cd, net);
+       if (err) {
+               cache_destroy_net(cd, net);
+               return err;
+       }
+       sn->rsc_cache = cd;
+       return 0;
+}
+
+static void rsc_cache_destroy_net(struct net *net)
+{
+       struct sunrpc_net *sn = net_generic(net, sunrpc_net_id);
+       struct cache_detail *cd = sn->rsc_cache;
+
+       sn->rsc_cache = NULL;
+       cache_purge(cd);
+       cache_unregister_net(cd, net);
+       cache_destroy_net(cd, net);
+}
+
  int
-gss_svc_init(void)
+gss_svc_init_net(struct net *net)
  {
-       int rv = svc_auth_register(RPC_AUTH_GSS, &svcauthops_gss);
+       int rv;
+
+       rv = rsc_cache_create_net(net);
         if (rv)
                 return rv;
-       rv = cache_register(&rsc_cache);
+       rv = rsi_cache_create_net(net);
         if (rv)
                 goto out1;
-       rv = cache_register(&rsi_cache);
-       if (rv)
-               goto out2;
         return 0;
-out2:
-       cache_unregister(&rsc_cache);
  out1:
-       svc_auth_unregister(RPC_AUTH_GSS);
+       rsc_cache_destroy_net(net);
         return rv;
  }
  
+void
+gss_svc_shutdown_net(struct net *net)
+{
+       rsi_cache_destroy_net(net);
+       rsc_cache_destroy_net(net);
+}
+
+int
+gss_svc_init(void)
+{
+       return svc_auth_register(RPC_AUTH_GSS, &svcauthops_gss);
+}
+
  void
  gss_svc_shutdown(void)
  {
-       cache_unregister(&rsc_cache);
-       cache_unregister(&rsi_cache);
         svc_auth_unregister(RPC_AUTH_GSS);
  }
diff --git a/net/sunrpc/backchannel_rqst.c b/net/sunrpc/backchannel_rqst.c
index 3ad435a14ada7ecd4afce4374f7dfb32dca77c52..31def68a0f6e7032f232474b9a0d09a0083b9ac7 100644
--- a/net/sunrpc/backchannel_rqst.c
+++ b/net/sunrpc/backchannel_rqst.c
@@ -25,6 +25,7 @@ SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
  #include <linux/slab.h>
  #include <linux/sunrpc/xprt.h>
  #include <linux/export.h>
+#include <linux/sunrpc/bc_xprt.h>
  
  #ifdef RPC_DEBUG
  #define RPCDBG_FACILITY        RPCDBG_TRANS
diff --git a/net/sunrpc/cache.c b/net/sunrpc/cache.c
index 465df9ae1046b7fc12fe99fd0759017be7a7dc2a..f21ece08876440d574dad1ac6a09cf22d40d043e 100644
--- a/net/sunrpc/cache.c
+++ b/net/sunrpc/cache.c
@@ -344,7 +344,7 @@ static int current_index;
  static void do_cache_clean(struct work_struct *work);
  static struct delayed_work cache_cleaner;
  
-static void sunrpc_init_cache_detail(struct cache_detail *cd)
+void sunrpc_init_cache_detail(struct cache_detail *cd)
  {
         rwlock_init(&cd->hash_lock);
         INIT_LIST_HEAD(&cd->queue);
@@ -360,8 +360,9 @@ static void sunrpc_init_cache_detail(struct cache_detail *cd)
         /* start the cleaning process */
         schedule_delayed_work(&cache_cleaner, 0);
  }
+EXPORT_SYMBOL_GPL(sunrpc_init_cache_detail);
  
-static void sunrpc_destroy_cache_detail(struct cache_detail *cd)
+void sunrpc_destroy_cache_detail(struct cache_detail *cd)
  {
         cache_purge(cd);
         spin_lock(&cache_list_lock);
@@ -384,6 +385,7 @@ static void sunrpc_destroy_cache_detail(struct cache_detail *cd)
  out:
         printk(KERN_ERR "nfsd: failed to unregister %s cache\n", cd->name);
  }
+EXPORT_SYMBOL_GPL(sunrpc_destroy_cache_detail);
  
  /* clean cache tries to find something to clean
   * and cleans it.
@@ -1643,12 +1645,6 @@ int cache_register_net(struct cache_detail *cd, struct net *net)
  }
  EXPORT_SYMBOL_GPL(cache_register_net);
  
-int cache_register(struct cache_detail *cd)
-{
-       return cache_register_net(cd, &init_net);
-}
-EXPORT_SYMBOL_GPL(cache_register);
-
  void cache_unregister_net(struct cache_detail *cd, struct net *net)
  {
         remove_cache_proc_entries(cd, net);
@@ -1656,11 +1652,31 @@ void cache_unregister_net(struct cache_detail *cd, struct net *net)
  }
  EXPORT_SYMBOL_GPL(cache_unregister_net);
  
-void cache_unregister(struct cache_detail *cd)
+struct cache_detail *cache_create_net(struct cache_detail *tmpl, struct net *net)
+{
+       struct cache_detail *cd;
+
+       cd = kmemdup(tmpl, sizeof(struct cache_detail), GFP_KERNEL);
+       if (cd == NULL)
+               return ERR_PTR(-ENOMEM);
+
+       cd->hash_table = kzalloc(cd->hash_size * sizeof(struct cache_head *),
+                                GFP_KERNEL);
+       if (cd->hash_table == NULL) {
+               kfree(cd);
+               return ERR_PTR(-ENOMEM);
+       }
+       cd->net = net;
+       return cd;
+}
+EXPORT_SYMBOL_GPL(cache_create_net);
+
+void cache_destroy_net(struct cache_detail *cd, struct net *net)
  {
-       cache_unregister_net(cd, &init_net);
+       kfree(cd->hash_table);
+       kfree(cd);
  }
-EXPORT_SYMBOL_GPL(cache_unregister);
+EXPORT_SYMBOL_GPL(cache_destroy_net);
  
  static ssize_t cache_read_pipefs(struct file *filp, char __user *buf,
                                  size_t count, loff_t *ppos)
@@ -1787,17 +1803,14 @@ int sunrpc_cache_register_pipefs(struct dentry *parent,
         struct dentry *dir;
         int ret = 0;
  
-       sunrpc_init_cache_detail(cd);
         q.name = name;
         q.len = strlen(name);
         q.hash = full_name_hash(q.name, q.len);
         dir = rpc_create_cache_dir(parent, &q, umode, cd);
         if (!IS_ERR(dir))
                 cd->u.pipefs.dir = dir;
-       else {
-               sunrpc_destroy_cache_detail(cd);
+       else
                 ret = PTR_ERR(dir);
-       }
         return ret;
  }
  EXPORT_SYMBOL_GPL(sunrpc_cache_register_pipefs);
@@ -1806,7 +1819,6 @@ void sunrpc_cache_unregister_pipefs(struct cache_detail *cd)
  {
         rpc_remove_cache_dir(cd->u.pipefs.dir);
         cd->u.pipefs.dir = NULL;
-       sunrpc_destroy_cache_detail(cd);
  }
  EXPORT_SYMBOL_GPL(sunrpc_cache_unregister_pipefs);
  
diff --git a/net/sunrpc/clnt.c b/net/sunrpc/clnt.c
index f0268ea7e71121f3c07f159b6f46732b4074dd73..7a4cb5fdc21239d3628f44c912018399121b14e5 100644
--- a/net/sunrpc/clnt.c
+++ b/net/sunrpc/clnt.c
@@ -31,13 +31,16 @@
  #include <linux/in.h>
  #include <linux/in6.h>
  #include <linux/un.h>
+#include <linux/rcupdate.h>
  
  #include <linux/sunrpc/clnt.h>
  #include <linux/sunrpc/rpc_pipe_fs.h>
  #include <linux/sunrpc/metrics.h>
  #include <linux/sunrpc/bc_xprt.h>
+#include <trace/events/sunrpc.h>
  
  #include "sunrpc.h"
+#include "netns.h"
  
  #ifdef RPC_DEBUG
  # define RPCDBG_FACILITY       RPCDBG_CALL
@@ -50,8 +53,6 @@
  /*
   * All RPC clients are linked into this list
   */
-static LIST_HEAD(all_clients);
-static DEFINE_SPINLOCK(rpc_client_lock);
  
  static DECLARE_WAIT_QUEUE_HEAD(destroy_wait);
  
@@ -81,82 +82,191 @@ static int rpc_ping(struct rpc_clnt *clnt);
  
  static void rpc_register_client(struct rpc_clnt *clnt)
  {
-       spin_lock(&rpc_client_lock);
-       list_add(&clnt->cl_clients, &all_clients);
-       spin_unlock(&rpc_client_lock);
+       struct net *net = rpc_net_ns(clnt);
+       struct sunrpc_net *sn = net_generic(net, sunrpc_net_id);
+
+       spin_lock(&sn->rpc_client_lock);
+       list_add(&clnt->cl_clients, &sn->all_clients);
+       spin_unlock(&sn->rpc_client_lock);
  }
  
  static void rpc_unregister_client(struct rpc_clnt *clnt)
  {
-       spin_lock(&rpc_client_lock);
+       struct net *net = rpc_net_ns(clnt);
+       struct sunrpc_net *sn = net_generic(net, sunrpc_net_id);
+
+       spin_lock(&sn->rpc_client_lock);
         list_del(&clnt->cl_clients);
-       spin_unlock(&rpc_client_lock);
+       spin_unlock(&sn->rpc_client_lock);
  }
  
-static int
-rpc_setup_pipedir(struct rpc_clnt *clnt, char *dir_name)
+static void __rpc_clnt_remove_pipedir(struct rpc_clnt *clnt)
+{
+       if (clnt->cl_dentry) {
+               if (clnt->cl_auth && clnt->cl_auth->au_ops->pipes_destroy)
+                       clnt->cl_auth->au_ops->pipes_destroy(clnt->cl_auth);
+               rpc_remove_client_dir(clnt->cl_dentry);
+       }
+       clnt->cl_dentry = NULL;
+}
+
+static void rpc_clnt_remove_pipedir(struct rpc_clnt *clnt)
+{
+       struct net *net = rpc_net_ns(clnt);
+       struct super_block *pipefs_sb;
+
+       pipefs_sb = rpc_get_sb_net(net);
+       if (pipefs_sb) {
+               __rpc_clnt_remove_pipedir(clnt);
+               rpc_put_sb_net(net);
+       }
+}
+
+static struct dentry *rpc_setup_pipedir_sb(struct super_block *sb,
+                                   struct rpc_clnt *clnt,
+                                   const char *dir_name)
  {
         static uint32_t clntid;
-       struct path path, dir;
         char name[15];
         struct qstr q = {
                 .name = name,
         };
+       struct dentry *dir, *dentry;
         int error;
  
-       clnt->cl_path.mnt = ERR_PTR(-ENOENT);
-       clnt->cl_path.dentry = ERR_PTR(-ENOENT);
-       if (dir_name == NULL)
-               return 0;
-
-       path.mnt = rpc_get_mount();
-       if (IS_ERR(path.mnt))
-               return PTR_ERR(path.mnt);
-       error = vfs_path_lookup(path.mnt->mnt_root, path.mnt, dir_name, 0, &dir);
-       if (error)
-               goto err;
-
+       dir = rpc_d_lookup_sb(sb, dir_name);
+       if (dir == NULL)
+               return dir;
         for (;;) {
                 q.len = snprintf(name, sizeof(name), "clnt%x", (unsigned int)clntid++);
                 name[sizeof(name) - 1] = '\0';
                 q.hash = full_name_hash(q.name, q.len);
-               path.dentry = rpc_create_client_dir(dir.dentry, &q, clnt);
-               if (!IS_ERR(path.dentry))
+               dentry = rpc_create_client_dir(dir, &q, clnt);
+               if (!IS_ERR(dentry))
                         break;
-               error = PTR_ERR(path.dentry);
+               error = PTR_ERR(dentry);
                 if (error != -EEXIST) {
                         printk(KERN_INFO "RPC: Couldn't create pipefs entry"
                                         " %s/%s, error %d\n",
                                         dir_name, name, error);
-                       goto err_path_put;
+                       break;
                 }
         }
-       path_put(&dir);
-       clnt->cl_path = path;
+       dput(dir);
+       return dentry;
+}
+
+static int
+rpc_setup_pipedir(struct rpc_clnt *clnt, const char *dir_name)
+{
+       struct net *net = rpc_net_ns(clnt);
+       struct super_block *pipefs_sb;
+       struct dentry *dentry;
+
+       clnt->cl_dentry = NULL;
+       if (dir_name == NULL)
+               return 0;
+       pipefs_sb = rpc_get_sb_net(net);
+       if (!pipefs_sb)
+               return 0;
+       dentry = rpc_setup_pipedir_sb(pipefs_sb, clnt, dir_name);
+       rpc_put_sb_net(net);
+       if (IS_ERR(dentry))
+               return PTR_ERR(dentry);
+       clnt->cl_dentry = dentry;
         return 0;
-err_path_put:
-       path_put(&dir);
-err:
-       rpc_put_mount();
+}
+
+static int __rpc_pipefs_event(struct rpc_clnt *clnt, unsigned long event,
+                               struct super_block *sb)
+{
+       struct dentry *dentry;
+       int err = 0;
+
+       switch (event) {
+       case RPC_PIPEFS_MOUNT:
+               if (clnt->cl_program->pipe_dir_name == NULL)
+                       break;
+               dentry = rpc_setup_pipedir_sb(sb, clnt,
+                                             clnt->cl_program->pipe_dir_name);
+               BUG_ON(dentry == NULL);
+               if (IS_ERR(dentry))
+                       return PTR_ERR(dentry);
+               clnt->cl_dentry = dentry;
+               if (clnt->cl_auth->au_ops->pipes_create) {
+                       err = clnt->cl_auth->au_ops->pipes_create(clnt->cl_auth);
+                       if (err)
+                               __rpc_clnt_remove_pipedir(clnt);
+               }
+               break;
+       case RPC_PIPEFS_UMOUNT:
+               __rpc_clnt_remove_pipedir(clnt);
+               break;
+       default:
+               printk(KERN_ERR "%s: unknown event: %ld\n", __func__, event);
+               return -ENOTSUPP;
+       }
+       return err;
+}
+
+static struct rpc_clnt *rpc_get_client_for_event(struct net *net, int event)
+{
+       struct sunrpc_net *sn = net_generic(net, sunrpc_net_id);
+       struct rpc_clnt *clnt;
+
+       spin_lock(&sn->rpc_client_lock);
+       list_for_each_entry(clnt, &sn->all_clients, cl_clients) {
+               if (((event == RPC_PIPEFS_MOUNT) && clnt->cl_dentry) ||
+                   ((event == RPC_PIPEFS_UMOUNT) && !clnt->cl_dentry))
+                       continue;
+               atomic_inc(&clnt->cl_count);
+               spin_unlock(&sn->rpc_client_lock);
+               return clnt;
+       }
+       spin_unlock(&sn->rpc_client_lock);
+       return NULL;
+}
+
+static int rpc_pipefs_event(struct notifier_block *nb, unsigned long event,
+                           void *ptr)
+{
+       struct super_block *sb = ptr;
+       struct rpc_clnt *clnt;
+       int error = 0;
+
+       while ((clnt = rpc_get_client_for_event(sb->s_fs_info, event))) {
+               error = __rpc_pipefs_event(clnt, event, sb);
+               rpc_release_client(clnt);
+               if (error)
+                       break;
+       }
         return error;
  }
  
+static struct notifier_block rpc_clients_block = {
+       .notifier_call  = rpc_pipefs_event,
+       .priority       = SUNRPC_PIPEFS_RPC_PRIO,
+};
+
+int rpc_clients_notifier_register(void)
+{
+       return rpc_pipefs_notifier_register(&rpc_clients_block);
+}
+
+void rpc_clients_notifier_unregister(void)
+{
+       return rpc_pipefs_notifier_unregister(&rpc_clients_block);
+}
+
  static struct rpc_clnt * rpc_new_client(const struct rpc_create_args *args, struct rpc_xprt *xprt)
  {
-       struct rpc_program      *program = args->program;
-       struct rpc_version      *version;
+       const struct rpc_program *program = args->program;
+       const struct rpc_version *version;
         struct rpc_clnt         *clnt = NULL;
         struct rpc_auth         *auth;
         int err;
-       size_t len;
  
         /* sanity check the name before trying to print it */
-       err = -EINVAL;
-       len = strlen(args->servername);
-       if (len > RPC_MAXNETNAMELEN)
-               goto out_no_rpciod;
-       len++;
-
         dprintk("RPC:       creating %s client for %s (xprt %p)\n",
                         program->name, args->servername, xprt);
  
@@ -179,17 +289,7 @@ static struct rpc_clnt * rpc_new_client(const struct rpc_create_args *args, stru
                 goto out_err;
         clnt->cl_parent = clnt;
  
-       clnt->cl_server = clnt->cl_inline_name;
-       if (len > sizeof(clnt->cl_inline_name)) {
-               char *buf = kmalloc(len, GFP_KERNEL);
-               if (buf != NULL)
-                       clnt->cl_server = buf;
-               else
-                       len = sizeof(clnt->cl_inline_name);
-       }
-       strlcpy(clnt->cl_server, args->servername, len);
-
-       clnt->cl_xprt     = xprt;
+       rcu_assign_pointer(clnt->cl_xprt, xprt);
         clnt->cl_procinfo = version->procs;
         clnt->cl_maxproc  = version->nrprocs;
         clnt->cl_protname = program->name;
@@ -204,7 +304,7 @@ static struct rpc_clnt * rpc_new_client(const struct rpc_create_args *args, stru
         INIT_LIST_HEAD(&clnt->cl_tasks);
         spin_lock_init(&clnt->cl_lock);
  
-       if (!xprt_bound(clnt->cl_xprt))
+       if (!xprt_bound(xprt))
                 clnt->cl_autobind = 1;
  
         clnt->cl_timeout = xprt->timeout;
@@ -246,17 +346,12 @@ static struct rpc_clnt * rpc_new_client(const struct rpc_create_args *args, stru
         return clnt;
  
  out_no_auth:
-       if (!IS_ERR(clnt->cl_path.dentry)) {
-               rpc_remove_client_dir(clnt->cl_path.dentry);
-               rpc_put_mount();
-       }
+       rpc_clnt_remove_pipedir(clnt);
  out_no_path:
         kfree(clnt->cl_principal);
  out_no_principal:
         rpc_free_iostats(clnt->cl_metrics);
  out_no_stats:
-       if (clnt->cl_server != clnt->cl_inline_name)
-               kfree(clnt->cl_server);
         kfree(clnt);
  out_err:
         xprt_put(xprt);
@@ -286,6 +381,7 @@ struct rpc_clnt *rpc_create(struct rpc_create_args *args)
                 .srcaddr = args->saddress,
                 .dstaddr = args->address,
                 .addrlen = args->addrsize,
+               .servername = args->servername,
                 .bc_xprt = args->bc_xprt,
         };
         char servername[48];
@@ -294,7 +390,7 @@ struct rpc_clnt *rpc_create(struct rpc_create_args *args)
          * If the caller chooses not to specify a hostname, whip
          * up a string representation of the passed-in address.
          */
-       if (args->servername == NULL) {
+       if (xprtargs.servername == NULL) {
                 struct sockaddr_un *sun =
                                 (struct sockaddr_un *)args->address;
                 struct sockaddr_in *sin =
@@ -321,7 +417,7 @@ struct rpc_clnt *rpc_create(struct rpc_create_args *args)
                          * address family isn't recognized. */
                         return ERR_PTR(-EINVAL);
                 }
-               args->servername = servername;
+               xprtargs.servername = servername;
         }
  
         xprt = xprt_create_transport(&xprtargs);
@@ -374,6 +470,7 @@ struct rpc_clnt *
  rpc_clone_client(struct rpc_clnt *clnt)
  {
         struct rpc_clnt *new;
+       struct rpc_xprt *xprt;
         int err = -ENOMEM;
  
         new = kmemdup(clnt, sizeof(*new), GFP_KERNEL);
@@ -393,18 +490,25 @@ rpc_clone_client(struct rpc_clnt *clnt)
                 if (new->cl_principal == NULL)
                         goto out_no_principal;
         }
+       rcu_read_lock();
+       xprt = xprt_get(rcu_dereference(clnt->cl_xprt));
+       rcu_read_unlock();
+       if (xprt == NULL)
+               goto out_no_transport;
+       rcu_assign_pointer(new->cl_xprt, xprt);
         atomic_set(&new->cl_count, 1);
         err = rpc_setup_pipedir(new, clnt->cl_program->pipe_dir_name);
         if (err != 0)
                 goto out_no_path;
         if (new->cl_auth)
                 atomic_inc(&new->cl_auth->au_count);
-       xprt_get(clnt->cl_xprt);
         atomic_inc(&clnt->cl_count);
         rpc_register_client(new);
         rpciod_up();
         return new;
  out_no_path:
+       xprt_put(xprt);
+out_no_transport:
         kfree(new->cl_principal);
  out_no_principal:
         rpc_free_iostats(new->cl_metrics);
@@ -453,8 +557,9 @@ EXPORT_SYMBOL_GPL(rpc_killall_tasks);
   */
  void rpc_shutdown_client(struct rpc_clnt *clnt)
  {
-       dprintk("RPC:       shutting down %s client for %s\n",
-                       clnt->cl_protname, clnt->cl_server);
+       dprintk_rcu("RPC:       shutting down %s client for %s\n",
+                       clnt->cl_protname,
+                       rcu_dereference(clnt->cl_xprt)->servername);
  
         while (!list_empty(&clnt->cl_tasks)) {
                 rpc_killall_tasks(clnt);
@@ -472,24 +577,17 @@ EXPORT_SYMBOL_GPL(rpc_shutdown_client);
  static void
  rpc_free_client(struct rpc_clnt *clnt)
  {
-       dprintk("RPC:       destroying %s client for %s\n",
-                       clnt->cl_protname, clnt->cl_server);
-       if (!IS_ERR(clnt->cl_path.dentry)) {
-               rpc_remove_client_dir(clnt->cl_path.dentry);
-               rpc_put_mount();
-       }
-       if (clnt->cl_parent != clnt) {
+       dprintk_rcu("RPC:       destroying %s client for %s\n",
+                       clnt->cl_protname,
+                       rcu_dereference(clnt->cl_xprt)->servername);
+       if (clnt->cl_parent != clnt)
                 rpc_release_client(clnt->cl_parent);
-               goto out_free;
-       }
-       if (clnt->cl_server != clnt->cl_inline_name)
-               kfree(clnt->cl_server);
-out_free:
         rpc_unregister_client(clnt);
+       rpc_clnt_remove_pipedir(clnt);
         rpc_free_iostats(clnt->cl_metrics);
         kfree(clnt->cl_principal);
         clnt->cl_metrics = NULL;
-       xprt_put(clnt->cl_xprt);
+       xprt_put(rcu_dereference_raw(clnt->cl_xprt));
         rpciod_down();
         kfree(clnt);
  }
@@ -542,11 +640,11 @@ rpc_release_client(struct rpc_clnt *clnt)
   * The Sun NFSv2/v3 ACL protocol can do this.
   */
  struct rpc_clnt *rpc_bind_new_program(struct rpc_clnt *old,
-                                     struct rpc_program *program,
+                                     const struct rpc_program *program,
                                       u32 vers)
  {
         struct rpc_clnt *clnt;
-       struct rpc_version *version;
+       const struct rpc_version *version;
         int err;
  
         BUG_ON(vers >= program->nrvers || !program->version[vers]);
@@ -778,13 +876,18 @@ EXPORT_SYMBOL_GPL(rpc_call_start);
  size_t rpc_peeraddr(struct rpc_clnt *clnt, struct sockaddr *buf, size_t bufsize)
  {
         size_t bytes;
-       struct rpc_xprt *xprt = clnt->cl_xprt;
+       struct rpc_xprt *xprt;
  
-       bytes = sizeof(xprt->addr);
+       rcu_read_lock();
+       xprt = rcu_dereference(clnt->cl_xprt);
+
+       bytes = xprt->addrlen;
         if (bytes > bufsize)
                 bytes = bufsize;
-       memcpy(buf, &clnt->cl_xprt->addr, bytes);
-       return xprt->addrlen;
+       memcpy(buf, &xprt->addr, bytes);
+       rcu_read_unlock();
+
+       return bytes;
  }
  EXPORT_SYMBOL_GPL(rpc_peeraddr);
  
@@ -793,11 +896,16 @@ EXPORT_SYMBOL_GPL(rpc_peeraddr);
   * @clnt: RPC client structure
   * @format: address format
   *
+ * NB: the lifetime of the memory referenced by the returned pointer is
+ * the same as the rpc_xprt itself.  As long as the caller uses this
+ * pointer, it must hold the RCU read lock.
   */
  const char *rpc_peeraddr2str(struct rpc_clnt *clnt,
                              enum rpc_display_format_t format)
  {
-       struct rpc_xprt *xprt = clnt->cl_xprt;
+       struct rpc_xprt *xprt;
+
+       xprt = rcu_dereference(clnt->cl_xprt);
  
         if (xprt->address_strings[format] != NULL)
                 return xprt->address_strings[format];
@@ -806,17 +914,203 @@ const char *rpc_peeraddr2str(struct rpc_clnt *clnt,
  }
  EXPORT_SYMBOL_GPL(rpc_peeraddr2str);
  
+static const struct sockaddr_in rpc_inaddr_loopback = {
+       .sin_family             = AF_INET,
+       .sin_addr.s_addr        = htonl(INADDR_ANY),
+};
+
+static const struct sockaddr_in6 rpc_in6addr_loopback = {
+       .sin6_family            = AF_INET6,
+       .sin6_addr              = IN6ADDR_ANY_INIT,
+};
+
+/*
+ * Try a getsockname() on a connected datagram socket.  Using a
+ * connected datagram socket prevents leaving a socket in TIME_WAIT.
+ * This conserves the ephemeral port number space.
+ *
+ * Returns zero and fills in "buf" if successful; otherwise, a
+ * negative errno is returned.
+ */
+static int rpc_sockname(struct net *net, struct sockaddr *sap, size_t salen,
+                       struct sockaddr *buf, int buflen)
+{
+       struct socket *sock;
+       int err;
+
+       err = __sock_create(net, sap->sa_family,
+                               SOCK_DGRAM, IPPROTO_UDP, &sock, 1);
+       if (err < 0) {
+               dprintk("RPC:       can't create UDP socket (%d)\n", err);
+               goto out;
+       }
+
+       switch (sap->sa_family) {
+       case AF_INET:
+               err = kernel_bind(sock,
+                               (struct sockaddr *)&rpc_inaddr_loopback,
+                               sizeof(rpc_inaddr_loopback));
+               break;
+       case AF_INET6:
+               err = kernel_bind(sock,
+                               (struct sockaddr *)&rpc_in6addr_loopback,
+                               sizeof(rpc_in6addr_loopback));
+               break;
+       default:
+               err = -EAFNOSUPPORT;
+               goto out_release;
+       }
+       if (err < 0) {
+               dprintk("RPC:       can't bind UDP socket (%d)\n", err);
+               goto out_release;
+       }
+
+       err = kernel_connect(sock, sap, salen, 0);
+       if (err < 0) {
+               dprintk("RPC:       can't connect UDP socket (%d)\n", err);
+               goto out_release;
+       }
+
+       err = kernel_getsockname(sock, buf, &buflen);
+       if (err < 0) {
+               dprintk("RPC:       getsockname failed (%d)\n", err);
+               goto out_release;
+       }
+
+       err = 0;
+       if (buf->sa_family == AF_INET6) {
+               struct sockaddr_in6 *sin6 = (struct sockaddr_in6 *)buf;
+               sin6->sin6_scope_id = 0;
+       }
+       dprintk("RPC:       %s succeeded\n", __func__);
+
+out_release:
+       sock_release(sock);
+out:
+       return err;
+}
+
+/*
+ * Scraping a connected socket failed, so we don't have a useable
+ * local address.  Fallback: generate an address that will prevent
+ * the server from calling us back.
+ *
+ * Returns zero and fills in "buf" if successful; otherwise, a
+ * negative errno is returned.
+ */
+static int rpc_anyaddr(int family, struct sockaddr *buf, size_t buflen)
+{
+       switch (family) {
+       case AF_INET:
+               if (buflen < sizeof(rpc_inaddr_loopback))
+                       return -EINVAL;
+               memcpy(buf, &rpc_inaddr_loopback,
+                               sizeof(rpc_inaddr_loopback));
+               break;
+       case AF_INET6:
+               if (buflen < sizeof(rpc_in6addr_loopback))
+                       return -EINVAL;
+               memcpy(buf, &rpc_in6addr_loopback,
+                               sizeof(rpc_in6addr_loopback));
+               break;
+       default:
+               dprintk("RPC:       %s: address family not supported\n",
+                       __func__);
+               return -EAFNOSUPPORT;
+       }
+       dprintk("RPC:       %s: succeeded\n", __func__);
+       return 0;
+}
+
+/**
+ * rpc_localaddr - discover local endpoint address for an RPC client
+ * @clnt: RPC client structure
+ * @buf: target buffer
+ * @buflen: size of target buffer, in bytes
+ *
+ * Returns zero and fills in "buf" if successful; otherwise, a
+ * negative errno is returned.
+ *
+ * This works even if the underlying transport is not currently connected,
+ * or if the upper layer never previously provided a source address.
+ *
+ * The result of this function call is transient: multiple calls in
+ * succession may give different results, depending on how local
+ * networking configuration changes over time.
+ */
+int rpc_localaddr(struct rpc_clnt *clnt, struct sockaddr *buf, size_t buflen)
+{
+       struct sockaddr_storage address;
+       struct sockaddr *sap = (struct sockaddr *)&address;
+       struct rpc_xprt *xprt;
+       struct net *net;
+       size_t salen;
+       int err;
+
+       rcu_read_lock();
+       xprt = rcu_dereference(clnt->cl_xprt);
+       salen = xprt->addrlen;
+       memcpy(sap, &xprt->addr, salen);
+       net = get_net(xprt->xprt_net);
+       rcu_read_unlock();
+
+       rpc_set_port(sap, 0);
+       err = rpc_sockname(net, sap, salen, buf, buflen);
+       put_net(net);
+       if (err != 0)
+               /* Couldn't discover local address, return ANYADDR */
+               return rpc_anyaddr(sap->sa_family, buf, buflen);
+       return 0;
+}
+EXPORT_SYMBOL_GPL(rpc_localaddr);
+
  void
  rpc_setbufsize(struct rpc_clnt *clnt, unsigned int sndsize, unsigned int rcvsize)
  {
-       struct rpc_xprt *xprt = clnt->cl_xprt;
+       struct rpc_xprt *xprt;
+
+       rcu_read_lock();
+       xprt = rcu_dereference(clnt->cl_xprt);
         if (xprt->ops->set_buffer_size)
                 xprt->ops->set_buffer_size(xprt, sndsize, rcvsize);
+       rcu_read_unlock();
  }
  EXPORT_SYMBOL_GPL(rpc_setbufsize);
  
-/*
- * Return size of largest payload RPC client can support, in bytes
+/**
+ * rpc_protocol - Get transport protocol number for an RPC client
+ * @clnt: RPC client to query
+ *
+ */
+int rpc_protocol(struct rpc_clnt *clnt)
+{
+       int protocol;
+
+       rcu_read_lock();
+       protocol = rcu_dereference(clnt->cl_xprt)->prot;
+       rcu_read_unlock();
+       return protocol;
+}
+EXPORT_SYMBOL_GPL(rpc_protocol);
+
+/**
+ * rpc_net_ns - Get the network namespace for this RPC client
+ * @clnt: RPC client to query
+ *
+ */
+struct net *rpc_net_ns(struct rpc_clnt *clnt)
+{
+       struct net *ret;
+
+       rcu_read_lock();
+       ret = rcu_dereference(clnt->cl_xprt)->xprt_net;
+       rcu_read_unlock();
+       return ret;
+}
+EXPORT_SYMBOL_GPL(rpc_net_ns);
+
+/**
+ * rpc_max_payload - Get maximum payload size for a transport, in bytes
+ * @clnt: RPC client to query
   *
   * For stream transports, this is one RPC record fragment (see RFC
   * 1831), as we don't support multi-record requests yet.  For datagram
@@ -825,7 +1119,12 @@ EXPORT_SYMBOL_GPL(rpc_setbufsize);
   */
  size_t rpc_max_payload(struct rpc_clnt *clnt)
  {
-       return clnt->cl_xprt->max_payload;
+       size_t ret;
+
+       rcu_read_lock();
+       ret = rcu_dereference(clnt->cl_xprt)->max_payload;
+       rcu_read_unlock();
+       return ret;
  }
  EXPORT_SYMBOL_GPL(rpc_max_payload);
  
@@ -836,8 +1135,11 @@ EXPORT_SYMBOL_GPL(rpc_max_payload);
   */
  void rpc_force_rebind(struct rpc_clnt *clnt)
  {
-       if (clnt->cl_autobind)
-               xprt_clear_bound(clnt->cl_xprt);
+       if (clnt->cl_autobind) {
+               rcu_read_lock();
+               xprt_clear_bound(rcu_dereference(clnt->cl_xprt));
+               rcu_read_unlock();
+       }
  }
  EXPORT_SYMBOL_GPL(rpc_force_rebind);
  
@@ -1163,6 +1465,7 @@ call_bind_status(struct rpc_task *task)
                 return;
         }
  
+       trace_rpc_bind_status(task);
         switch (task->tk_status) {
         case -ENOMEM:
                 dprintk("RPC: %5u rpcbind out of memory\n", task->tk_pid);
@@ -1262,6 +1565,7 @@ call_connect_status(struct rpc_task *task)
                 return;
         }
  
+       trace_rpc_connect_status(task, status);
         switch (status) {
                 /* if soft mounted, test if we've timed out */
         case -ETIMEDOUT:
@@ -1450,6 +1754,7 @@ call_status(struct rpc_task *task)
                 return;
         }
  
+       trace_rpc_call_status(task);
         task->tk_status = 0;
         switch(status) {
         case -EHOSTDOWN:
@@ -1513,8 +1818,11 @@ call_timeout(struct rpc_task *task)
         }
         if (RPC_IS_SOFT(task)) {
-               if (clnt->cl_chatty)
+               if (clnt->cl_chatty) {
+                       rcu_read_lock();
                         printk(KERN_NOTICE "%s: server %s not responding, timed out\n",
-                               clnt->cl_protname, clnt->cl_server);
+                               clnt->cl_protname,
+                               rcu_dereference(clnt->cl_xprt)->servername);
+                       rcu_read_unlock();
+               }
                 if (task->tk_flags & RPC_TASK_TIMEOUT)
                         rpc_exit(task, -ETIMEDOUT);
                 else
@@ -1524,9 +1832,13 @@ call_timeout(struct rpc_task *task)
  
         if (!(task->tk_flags & RPC_CALL_MAJORSEEN)) {
                 task->tk_flags |= RPC_CALL_MAJORSEEN;
-               if (clnt->cl_chatty)
+               if (clnt->cl_chatty) {
+                       rcu_read_lock();
                         printk(KERN_NOTICE "%s: server %s not responding, still trying\n",
-                       clnt->cl_protname, clnt->cl_server);
+                       clnt->cl_protname,
+                       rcu_dereference(clnt->cl_xprt)->servername);
+                       rcu_read_unlock();
+               }
         }
         rpc_force_rebind(clnt);
         /*
@@ -1555,9 +1867,13 @@ call_decode(struct rpc_task *task)
         dprint_status(task);
  
         if (task->tk_flags & RPC_CALL_MAJORSEEN) {
-               if (clnt->cl_chatty)
+               if (clnt->cl_chatty) {
+                       rcu_read_lock();
                         printk(KERN_NOTICE "%s: server %s OK\n",
-                               clnt->cl_protname, clnt->cl_server);
+                               clnt->cl_protname,
+                               rcu_dereference(clnt->cl_xprt)->servername);
+                       rcu_read_unlock();
+               }
                 task->tk_flags &= ~RPC_CALL_MAJORSEEN;
         }
  
@@ -1635,6 +1951,7 @@ rpc_encode_header(struct rpc_task *task)
  static __be32 *
  rpc_verify_header(struct rpc_task *task)
  {
+       struct rpc_clnt *clnt = task->tk_client;
         struct kvec *iov = &task->tk_rqstp->rq_rcv_buf.head[0];
         int len = task->tk_rqstp->rq_rcv_buf.len >> 2;
         __be32  *p = iov->iov_base;
@@ -1707,8 +2024,11 @@ rpc_verify_header(struct rpc_task *task)
                         task->tk_action = call_bind;
                         goto out_retry;
                 case RPC_AUTH_TOOWEAK:
+                       rcu_read_lock();
                         printk(KERN_NOTICE "RPC: server %s requires stronger "
-                              "authentication.\n", task->tk_client->cl_server);
+                              "authentication.\n",
+                              rcu_dereference(clnt->cl_xprt)->servername);
+                       rcu_read_unlock();
                         break;
                 default:
                         dprintk("RPC: %5u %s: unknown auth error: %x\n",
@@ -1731,28 +2051,27 @@ rpc_verify_header(struct rpc_task *task)
         case RPC_SUCCESS:
                 return p;
         case RPC_PROG_UNAVAIL:
-               dprintk("RPC: %5u %s: program %u is unsupported by server %s\n",
-                               task->tk_pid, __func__,
-                               (unsigned int)task->tk_client->cl_prog,
-                               task->tk_client->cl_server);
+               dprintk_rcu("RPC: %5u %s: program %u is unsupported "
+                               "by server %s\n", task->tk_pid, __func__,
+                               (unsigned int)clnt->cl_prog,
+                               rcu_dereference(clnt->cl_xprt)->servername);
                 error = -EPFNOSUPPORT;
                 goto out_err;
         case RPC_PROG_MISMATCH:
-               dprintk("RPC: %5u %s: program %u, version %u unsupported by "
-                               "server %s\n", task->tk_pid, __func__,
-                               (unsigned int)task->tk_client->cl_prog,
-                               (unsigned int)task->tk_client->cl_vers,
-                               task->tk_client->cl_server);
+               dprintk_rcu("RPC: %5u %s: program %u, version %u unsupported "
+                               "by server %s\n", task->tk_pid, __func__,
+                               (unsigned int)clnt->cl_prog,
+                               (unsigned int)clnt->cl_vers,
+                               rcu_dereference(clnt->cl_xprt)->servername);
                 error = -EPROTONOSUPPORT;
                 goto out_err;
         case RPC_PROC_UNAVAIL:
-               dprintk("RPC: %5u %s: proc %s unsupported by program %u, "
+               dprintk_rcu("RPC: %5u %s: proc %s unsupported by program %u, "
                                 "version %u on server %s\n",
                                 task->tk_pid, __func__,
                                 rpc_proc_name(task),
-                               task->tk_client->cl_prog,
-                               task->tk_client->cl_vers,
-                               task->tk_client->cl_server);
+                               clnt->cl_prog, clnt->cl_vers,
+                               rcu_dereference(clnt->cl_xprt)->servername);
                 error = -EOPNOTSUPP;
                 goto out_err;
         case RPC_GARBAGE_ARGS:
@@ -1766,7 +2085,7 @@ rpc_verify_header(struct rpc_task *task)
         }
  
  out_garbage:
-       task->tk_client->cl_stats->rpcgarbage++;
+       clnt->cl_stats->rpcgarbage++;
         if (task->tk_garb_retry) {
                 task->tk_garb_retry--;
                 dprintk("RPC: %5u %s: retrying\n",
@@ -1852,14 +2171,15 @@ static void rpc_show_task(const struct rpc_clnt *clnt,
                 task->tk_action, rpc_waitq);
  }
  
-void rpc_show_tasks(void)
+void rpc_show_tasks(struct net *net)
  {
         struct rpc_clnt *clnt;
         struct rpc_task *task;
         int header = 0;
+       struct sunrpc_net *sn = net_generic(net, sunrpc_net_id);
  
-       spin_lock(&rpc_client_lock);
-       list_for_each_entry(clnt, &all_clients, cl_clients) {
+       spin_lock(&sn->rpc_client_lock);
+       list_for_each_entry(clnt, &sn->all_clients, cl_clients) {
                 spin_lock(&clnt->cl_lock);
                 list_for_each_entry(task, &clnt->cl_tasks, tk_task) {
                         if (!header) {
@@ -1870,6 +2190,6 @@ void rpc_show_tasks(void)
                 }
                 spin_unlock(&clnt->cl_lock);
         }
-       spin_unlock(&rpc_client_lock);
+       spin_unlock(&sn->rpc_client_lock);
  }
  #endif
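
The rpc_sockname() helper added to net/sunrpc/clnt.c above discovers a local
address by binding a UDP socket to the wildcard address, connecting it to the
peer, and reading the result back with getsockname().  The same idea can be
tried from user space.  What follows is only a rough sketch under stated
assumptions: local_addr_for() is a hypothetical name that does not appear in
the patch, it handles AF_INET only, and it uses the ordinary socket(2) calls
rather than the in-kernel __sock_create()/kernel_bind()/kernel_connect()
interfaces.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>
#include <unistd.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>

/*
 * Hypothetical user-space helper (not part of the patch): find the local
 * IPv4 address the kernel would use as source when talking to "dst".
 * Returns zero and fills in "out" if successful; otherwise, a negative
 * errno is returned.
 */
static int local_addr_for(const struct sockaddr_in *dst,
			  struct sockaddr_in *out)
{
	struct sockaddr_in any;
	socklen_t len = sizeof(*out);
	int fd, err = 0;

	assert(dst != NULL && out != NULL);

	fd = socket(AF_INET, SOCK_DGRAM, IPPROTO_UDP);
	if (fd < 0)
		return -errno;

	/* Bind to the wildcard address and port 0, as rpc_sockname() does. */
	memset(&any, 0, sizeof(any));
	any.sin_family = AF_INET;
	any.sin_addr.s_addr = htonl(INADDR_ANY);
	if (bind(fd, (const struct sockaddr *)&any, sizeof(any)) < 0) {
		err = -errno;
		goto out_close;
	}

	/* connect() on a datagram socket only records the peer address. */
	if (connect(fd, (const struct sockaddr *)dst, sizeof(*dst)) < 0) {
		err = -errno;
		goto out_close;
	}

	/* The kernel has now picked a source address; read it back. */
	if (getsockname(fd, (struct sockaddr *)out, &len) < 0)
		err = -errno;

out_close:
	/* Every path taken after socket() succeeded releases the descriptor. */
	close(fd);
	return err;
}
```

A caller would fill in a struct sockaddr_in for the server and pass it as
"dst".  Because a datagram socket is used, connect() sends nothing on the
wire and close() leaves no socket behind in TIME_WAIT, which is the property
the comment above rpc_sockname() relies on.
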
diff --git a/net/sunrpc/netns.h b/net/sunrpc/netns.h

index d013bf211caeb0ca87874ee21d741466d359ea85..ce7bd449173dc3e1095d28909f5c23879cc0c477 100644
--- a/net/sunrpc/netns.h
+++ b/net/sunrpc/netns.h
@@ -9,6 +9,20 @@ struct cache_detail;
  struct sunrpc_net {
         struct proc_dir_entry *proc_net_rpc;
         struct cache_detail *ip_map_cache;
+       struct cache_detail *unix_gid_cache;
+       struct cache_detail *rsc_cache;
+       struct cache_detail *rsi_cache;
+
+       struct super_block *pipefs_sb;
+       struct mutex pipefs_sb_lock;
+
+       struct list_head all_clients;
+       spinlock_t rpc_client_lock;
+
+       struct rpc_clnt *rpcb_local_clnt;
+       struct rpc_clnt *rpcb_local_clnt4;
+       spinlock_t rpcb_clnt_lock;
+       unsigned int rpcb_users;
  };
  
  extern int sunrpc_net_id;
diff --git a/net/sunrpc/rpc_pipe.c b/net/sunrpc/rpc_pipe.c

index 7d6dd6efbdbe33020985ebccc4cad492f867f475..c84c0e0c41cb39dd41d37c3cc23c875b98c1e4d7 100644
--- a/net/sunrpc/rpc_pipe.c
+++ b/net/sunrpc/rpc_pipe.c
@@ -16,9 +16,9 @@
  #include <linux/namei.h>
  #include <linux/fsnotify.h>
  #include <linux/kernel.h>
+#include <linux/rcupdate.h>
  
  #include <asm/ioctls.h>
-#include <linux/fs.h>
  #include <linux/poll.h>
  #include <linux/wait.h>
  #include <linux/seq_file.h>
@@ -27,9 +27,15 @@
  #include <linux/workqueue.h>
  #include <linux/sunrpc/rpc_pipe_fs.h>
  #include <linux/sunrpc/cache.h>
+#include <linux/nsproxy.h>
+#include <linux/notifier.h>
  
-static struct vfsmount *rpc_mnt __read_mostly;
-static int rpc_mount_count;
+#include "netns.h"
+#include "sunrpc.h"
+
+#define RPCDBG_FACILITY RPCDBG_DEBUG
+
+#define NET_NAME(net)  ((net == &init_net) ? " (init_net)" : "")
  
  static struct file_system_type rpc_pipe_fs_type;
  
@@ -38,7 +44,21 @@ static struct kmem_cache *rpc_inode_cachep __read_mostly;
  
  #define RPC_UPCALL_TIMEOUT (30*HZ)
  
-static void rpc_purge_list(struct rpc_inode *rpci, struct list_head *head,
+static BLOCKING_NOTIFIER_HEAD(rpc_pipefs_notifier_list);
+
+int rpc_pipefs_notifier_register(struct notifier_block *nb)
+{
+       return blocking_notifier_chain_cond_register(&rpc_pipefs_notifier_list, nb);
+}
+EXPORT_SYMBOL_GPL(rpc_pipefs_notifier_register);
+
+void rpc_pipefs_notifier_unregister(struct notifier_block *nb)
+{
+       blocking_notifier_chain_unregister(&rpc_pipefs_notifier_list, nb);
+}
+EXPORT_SYMBOL_GPL(rpc_pipefs_notifier_unregister);
+
+static void rpc_purge_list(wait_queue_head_t *waitq, struct list_head *head,
                 void (*destroy_msg)(struct rpc_pipe_msg *), int err)
  {
         struct rpc_pipe_msg *msg;
@@ -51,30 +71,31 @@ static void rpc_purge_list(struct rpc_inode *rpci, struct list_head *head,
                 msg->errno = err;
                 destroy_msg(msg);
         } while (!list_empty(head));
-       wake_up(&rpci->waitq);
+       wake_up(waitq);
  }
  
  static void
  rpc_timeout_upcall_queue(struct work_struct *work)
  {
         LIST_HEAD(free_list);
-       struct rpc_inode *rpci =
-               container_of(work, struct rpc_inode, queue_timeout.work);
-       struct inode *inode = &rpci->vfs_inode;
+       struct rpc_pipe *pipe =
+               container_of(work, struct rpc_pipe, queue_timeout.work);
         void (*destroy_msg)(struct rpc_pipe_msg *);
+       struct dentry *dentry;
  
-       spin_lock(&inode->i_lock);
-       if (rpci->ops == NULL) {
-               spin_unlock(&inode->i_lock);
-               return;
+       spin_lock(&pipe->lock);
+       destroy_msg = pipe->ops->destroy_msg;
+       if (pipe->nreaders == 0) {
+               list_splice_init(&pipe->pipe, &free_list);
+               pipe->pipelen = 0;
         }
-       destroy_msg = rpci->ops->destroy_msg;
-       if (rpci->nreaders == 0) {
-               list_splice_init(&rpci->pipe, &free_list);
-               rpci->pipelen = 0;
+       dentry = dget(pipe->dentry);
+       spin_unlock(&pipe->lock);
+       if (dentry) {
+               rpc_purge_list(&RPC_I(dentry->d_inode)->waitq,
+                              &free_list, destroy_msg, -ETIMEDOUT);
+               dput(dentry);
         }
-       spin_unlock(&inode->i_lock);
-       rpc_purge_list(rpci, &free_list, destroy_msg, -ETIMEDOUT);
  }
  
  ssize_t rpc_pipe_generic_upcall(struct file *filp, struct rpc_pipe_msg *msg,
@@ -108,30 +129,31 @@ EXPORT_SYMBOL_GPL(rpc_pipe_generic_upcall);
   * initialize the fields of @msg (other than @msg->list) appropriately.
   */
  int
-rpc_queue_upcall(struct inode *inode, struct rpc_pipe_msg *msg)
+rpc_queue_upcall(struct rpc_pipe *pipe, struct rpc_pipe_msg *msg)
  {
-       struct rpc_inode *rpci = RPC_I(inode);
         int res = -EPIPE;
+       struct dentry *dentry;
  
-       spin_lock(&inode->i_lock);
-       if (rpci->ops == NULL)
-               goto out;
-       if (rpci->nreaders) {
-               list_add_tail(&msg->list, &rpci->pipe);
-               rpci->pipelen += msg->len;
+       spin_lock(&pipe->lock);
+       if (pipe->nreaders) {
+               list_add_tail(&msg->list, &pipe->pipe);
+               pipe->pipelen += msg->len;
                 res = 0;
-       } else if (rpci->flags & RPC_PIPE_WAIT_FOR_OPEN) {
-               if (list_empty(&rpci->pipe))
+       } else if (pipe->flags & RPC_PIPE_WAIT_FOR_OPEN) {
+               if (list_empty(&pipe->pipe))
                         queue_delayed_work(rpciod_workqueue,
-                                       &rpci->queue_timeout,
+                                       &pipe->queue_timeout,
                                         RPC_UPCALL_TIMEOUT);
-               list_add_tail(&msg->list, &rpci->pipe);
-               rpci->pipelen += msg->len;
+               list_add_tail(&msg->list, &pipe->pipe);
+               pipe->pipelen += msg->len;
                 res = 0;
         }
-out:
-       spin_unlock(&inode->i_lock);
-       wake_up(&rpci->waitq);
+       dentry = dget(pipe->dentry);
+       spin_unlock(&pipe->lock);
+       if (dentry) {
+               wake_up(&RPC_I(dentry->d_inode)->waitq);
+               dput(dentry);
+       }
         return res;
  }
  EXPORT_SYMBOL_GPL(rpc_queue_upcall);
@@ -145,29 +167,26 @@ rpc_inode_setowner(struct inode *inode, void *private)
  static void
  rpc_close_pipes(struct inode *inode)
  {
-       struct rpc_inode *rpci = RPC_I(inode);
-       const struct rpc_pipe_ops *ops;
+       struct rpc_pipe *pipe = RPC_I(inode)->pipe;
         int need_release;
+       LIST_HEAD(free_list);
  
         mutex_lock(&inode->i_mutex);
-       ops = rpci->ops;
-       if (ops != NULL) {
-               LIST_HEAD(free_list);
-               spin_lock(&inode->i_lock);
-               need_release = rpci->nreaders != 0 || rpci->nwriters != 0;
-               rpci->nreaders = 0;
-               list_splice_init(&rpci->in_upcall, &free_list);
-               list_splice_init(&rpci->pipe, &free_list);
-               rpci->pipelen = 0;
-               rpci->ops = NULL;
-               spin_unlock(&inode->i_lock);
-               rpc_purge_list(rpci, &free_list, ops->destroy_msg, -EPIPE);
-               rpci->nwriters = 0;
-               if (need_release && ops->release_pipe)
-                       ops->release_pipe(inode);
-               cancel_delayed_work_sync(&rpci->queue_timeout);
-       }
+       spin_lock(&pipe->lock);
+       need_release = pipe->nreaders != 0 || pipe->nwriters != 0;
+       pipe->nreaders = 0;
+       list_splice_init(&pipe->in_upcall, &free_list);
+       list_splice_init(&pipe->pipe, &free_list);
+       pipe->pipelen = 0;
+       pipe->dentry = NULL;
+       spin_unlock(&pipe->lock);
+       rpc_purge_list(&RPC_I(inode)->waitq, &free_list, pipe->ops->destroy_msg, -EPIPE);
+       pipe->nwriters = 0;
+       if (need_release && pipe->ops->release_pipe)
+               pipe->ops->release_pipe(inode);
+       cancel_delayed_work_sync(&pipe->queue_timeout);
         rpc_inode_setowner(inode, NULL);
+       RPC_I(inode)->pipe = NULL;
         mutex_unlock(&inode->i_mutex);
  }
  
@@ -197,23 +216,24 @@ rpc_destroy_inode(struct inode *inode)
  static int
  rpc_pipe_open(struct inode *inode, struct file *filp)
  {
-       struct rpc_inode *rpci = RPC_I(inode);
+       struct rpc_pipe *pipe;
         int first_open;
         int res = -ENXIO;
  
         mutex_lock(&inode->i_mutex);
-       if (rpci->ops == NULL)
+       pipe = RPC_I(inode)->pipe;
+       if (pipe == NULL)
                 goto out;
-       first_open = rpci->nreaders == 0 && rpci->nwriters == 0;
-       if (first_open && rpci->ops->open_pipe) {
-               res = rpci->ops->open_pipe(inode);
+       first_open = pipe->nreaders == 0 && pipe->nwriters == 0;
+       if (first_open && pipe->ops->open_pipe) {
+               res = pipe->ops->open_pipe(inode);
                 if (res)
                         goto out;
         }
         if (filp->f_mode & FMODE_READ)
-               rpci->nreaders++;
+               pipe->nreaders++;
         if (filp->f_mode & FMODE_WRITE)
-               rpci->nwriters++;
+               pipe->nwriters++;
         res = 0;
  out:
         mutex_unlock(&inode->i_mutex);
@@ -223,38 +243,39 @@ out:
  static int
  rpc_pipe_release(struct inode *inode, struct file *filp)
  {
-       struct rpc_inode *rpci = RPC_I(inode);
+       struct rpc_pipe *pipe;
         struct rpc_pipe_msg *msg;
         int last_close;
  
         mutex_lock(&inode->i_mutex);
-       if (rpci->ops == NULL)
+       pipe = RPC_I(inode)->pipe;
+       if (pipe == NULL)
                 goto out;
         msg = filp->private_data;
         if (msg != NULL) {
-               spin_lock(&inode->i_lock);
+               spin_lock(&pipe->lock);
                 msg->errno = -EAGAIN;
                 list_del_init(&msg->list);
-               spin_unlock(&inode->i_lock);
-               rpci->ops->destroy_msg(msg);
+               spin_unlock(&pipe->lock);
+               pipe->ops->destroy_msg(msg);
         }
         if (filp->f_mode & FMODE_WRITE)
-               rpci->nwriters --;
+               pipe->nwriters --;
         if (filp->f_mode & FMODE_READ) {
-               rpci->nreaders --;
-               if (rpci->nreaders == 0) {
+               pipe->nreaders --;
+               if (pipe->nreaders == 0) {
                         LIST_HEAD(free_list);
-                       spin_lock(&inode->i_lock);
-                       list_splice_init(&rpci->pipe, &free_list);
-                       rpci->pipelen = 0;
-                       spin_unlock(&inode->i_lock);
-                       rpc_purge_list(rpci, &free_list,
-                                       rpci->ops->destroy_msg, -EAGAIN);
+                       spin_lock(&pipe->lock);
+                       list_splice_init(&pipe->pipe, &free_list);
+                       pipe->pipelen = 0;
+                       spin_unlock(&pipe->lock);
+                       rpc_purge_list(&RPC_I(inode)->waitq, &free_list,
+                                       pipe->ops->destroy_msg, -EAGAIN);
                 }
         }
-       last_close = rpci->nwriters == 0 && rpci->nreaders == 0;
-       if (last_close && rpci->ops->release_pipe)
-               rpci->ops->release_pipe(inode);
+       last_close = pipe->nwriters == 0 && pipe->nreaders == 0;
+       if (last_close && pipe->ops->release_pipe)
+               pipe->ops->release_pipe(inode);
  out:
         mutex_unlock(&inode->i_mutex);
         return 0;
@@ -264,39 +285,40 @@ static ssize_t
  rpc_pipe_read(struct file *filp, char __user *buf, size_t len, loff_t *offset)
  {
         struct inode *inode = filp->f_path.dentry->d_inode;
-       struct rpc_inode *rpci = RPC_I(inode);
+       struct rpc_pipe *pipe;
         struct rpc_pipe_msg *msg;
         int res = 0;
  
         mutex_lock(&inode->i_mutex);
-       if (rpci->ops == NULL) {
+       pipe = RPC_I(inode)->pipe;
+       if (pipe == NULL) {
                 res = -EPIPE;
                 goto out_unlock;
         }
         msg = filp->private_data;
         if (msg == NULL) {
-               spin_lock(&inode->i_lock);
-               if (!list_empty(&rpci->pipe)) {
-                       msg = list_entry(rpci->pipe.next,
+               spin_lock(&pipe->lock);
+               if (!list_empty(&pipe->pipe)) {
+                       msg = list_entry(pipe->pipe.next,
                                         struct rpc_pipe_msg,
                                         list);
-                       list_move(&msg->list, &rpci->in_upcall);
-                       rpci->pipelen -= msg->len;
+                       list_move(&msg->list, &pipe->in_upcall);
+                       pipe->pipelen -= msg->len;
                         filp->private_data = msg;
                         msg->copied = 0;
                 }
-               spin_unlock(&inode->i_lock);
+               spin_unlock(&pipe->lock);
                 if (msg == NULL)
                         goto out_unlock;
         }
         /* NOTE: it is up to the callback to update msg->copied */
-       res = rpci->ops->upcall(filp, msg, buf, len);
+       res = pipe->ops->upcall(filp, msg, buf, len);
         if (res < 0 || msg->len == msg->copied) {
                 filp->private_data = NULL;
-               spin_lock(&inode->i_lock);
+               spin_lock(&pipe->lock);
                 list_del_init(&msg->list);
-               spin_unlock(&inode->i_lock);
-               rpci->ops->destroy_msg(msg);
+               spin_unlock(&pipe->lock);
+               pipe->ops->destroy_msg(msg);
         }
  out_unlock:
         mutex_unlock(&inode->i_mutex);
@@ -307,13 +329,12 @@ static ssize_t
  rpc_pipe_write(struct file *filp, const char __user *buf, size_t len, loff_t *offset)
  {
         struct inode *inode = filp->f_path.dentry->d_inode;
-       struct rpc_inode *rpci = RPC_I(inode);
         int res;
  
         mutex_lock(&inode->i_mutex);
         res = -EPIPE;
-       if (rpci->ops != NULL)
-               res = rpci->ops->downcall(filp, buf, len);
+       if (RPC_I(inode)->pipe != NULL)
+               res = RPC_I(inode)->pipe->ops->downcall(filp, buf, len);
         mutex_unlock(&inode->i_mutex);
         return res;
  }
@@ -321,17 +342,18 @@ rpc_pipe_write(struct file *filp, const char __user *buf, size_t len, loff_t *of
  static unsigned int
  rpc_pipe_poll(struct file *filp, struct poll_table_struct *wait)
  {
-       struct rpc_inode *rpci;
-       unsigned int mask = 0;
+       struct inode *inode = filp->f_path.dentry->d_inode;
+       struct rpc_inode *rpci = RPC_I(inode);
+       unsigned int mask = POLLOUT | POLLWRNORM;
  
-       rpci = RPC_I(filp->f_path.dentry->d_inode);
         poll_wait(filp, &rpci->waitq, wait);
  
-       mask = POLLOUT | POLLWRNORM;
-       if (rpci->ops == NULL)
+       mutex_lock(&inode->i_mutex);
+       if (rpci->pipe == NULL)
                 mask |= POLLERR | POLLHUP;
-       if (filp->private_data || !list_empty(&rpci->pipe))
+       else if (filp->private_data || !list_empty(&rpci->pipe->pipe))
                 mask |= POLLIN | POLLRDNORM;
+       mutex_unlock(&inode->i_mutex);
         return mask;
  }
  
@@ -339,23 +361,26 @@ static long
  rpc_pipe_ioctl(struct file *filp, unsigned int cmd, unsigned long arg)
  {
         struct inode *inode = filp->f_path.dentry->d_inode;
-       struct rpc_inode *rpci = RPC_I(inode);
+       struct rpc_pipe *pipe;
         int len;
  
         switch (cmd) {
         case FIONREAD:
-               spin_lock(&inode->i_lock);
-               if (rpci->ops == NULL) {
-                       spin_unlock(&inode->i_lock);
+               mutex_lock(&inode->i_mutex);
+               pipe = RPC_I(inode)->pipe;
+               if (pipe == NULL) {
+                       mutex_unlock(&inode->i_mutex);
                         return -EPIPE;
                 }
-               len = rpci->pipelen;
+               spin_lock(&pipe->lock);
+               len = pipe->pipelen;
                 if (filp->private_data) {
                         struct rpc_pipe_msg *msg;
                         msg = filp->private_data;
                         len += msg->len - msg->copied;
                 }
-               spin_unlock(&inode->i_lock);
+               spin_unlock(&pipe->lock);
+               mutex_unlock(&inode->i_mutex);
                 return put_user(len, (int __user *)arg);
         default:
                 return -EINVAL;
@@ -378,12 +403,15 @@ rpc_show_info(struct seq_file *m, void *v)
  {
         struct rpc_clnt *clnt = m->private;
  
-       seq_printf(m, "RPC server: %s\n", clnt->cl_server);
+       rcu_read_lock();
+       seq_printf(m, "RPC server: %s\n",
+                       rcu_dereference(clnt->cl_xprt)->servername);
         seq_printf(m, "service: %s (%d) version %d\n", clnt->cl_protname,
                         clnt->cl_prog, clnt->cl_vers);
         seq_printf(m, "address: %s\n", rpc_peeraddr2str(clnt, RPC_DISPLAY_ADDR));
         seq_printf(m, "protocol: %s\n", rpc_peeraddr2str(clnt, RPC_DISPLAY_PROTO));
         seq_printf(m, "port: %s\n", rpc_peeraddr2str(clnt, RPC_DISPLAY_PORT));
+       rcu_read_unlock();
         return 0;
  }
  
@@ -440,23 +468,6 @@ struct rpc_filelist {
         umode_t mode;
  };
  
-struct vfsmount *rpc_get_mount(void)
-{
-       int err;
-
-       err = simple_pin_fs(&rpc_pipe_fs_type, &rpc_mnt, &rpc_mount_count);
-       if (err != 0)
-               return ERR_PTR(err);
-       return rpc_mnt;
-}
-EXPORT_SYMBOL_GPL(rpc_get_mount);
-
-void rpc_put_mount(void)
-{
-       simple_release_fs(&rpc_mnt, &rpc_mount_count);
-}
-EXPORT_SYMBOL_GPL(rpc_put_mount);
-
  static int rpc_delete_dentry(const struct dentry *dentry)
  {
         return 1;
@@ -540,12 +551,47 @@ static int __rpc_mkdir(struct inode *dir, struct dentry *dentry,
         return 0;
  }
  
-static int __rpc_mkpipe(struct inode *dir, struct dentry *dentry,
-                       umode_t mode,
-                       const struct file_operations *i_fop,
-                       void *private,
-                       const struct rpc_pipe_ops *ops,
-                       int flags)
+static void
+init_pipe(struct rpc_pipe *pipe)
+{
+       pipe->nreaders = 0;
+       pipe->nwriters = 0;
+       INIT_LIST_HEAD(&pipe->in_upcall);
+       INIT_LIST_HEAD(&pipe->in_downcall);
+       INIT_LIST_HEAD(&pipe->pipe);
+       pipe->pipelen = 0;
+       INIT_DELAYED_WORK(&pipe->queue_timeout,
+                           rpc_timeout_upcall_queue);
+       pipe->ops = NULL;
+       spin_lock_init(&pipe->lock);
+       pipe->dentry = NULL;
+}
+
+void rpc_destroy_pipe_data(struct rpc_pipe *pipe)
+{
+       kfree(pipe);
+}
+EXPORT_SYMBOL_GPL(rpc_destroy_pipe_data);
+
+struct rpc_pipe *rpc_mkpipe_data(const struct rpc_pipe_ops *ops, int flags)
+{
+       struct rpc_pipe *pipe;
+
+       pipe = kzalloc(sizeof(struct rpc_pipe), GFP_KERNEL);
+       if (!pipe)
+               return ERR_PTR(-ENOMEM);
+       init_pipe(pipe);
+       pipe->ops = ops;
+       pipe->flags = flags;
+       return pipe;
+}
+EXPORT_SYMBOL_GPL(rpc_mkpipe_data);
+
+static int __rpc_mkpipe_dentry(struct inode *dir, struct dentry *dentry,
+                              umode_t mode,
+                              const struct file_operations *i_fop,
+                              void *private,
+                              struct rpc_pipe *pipe)
  {
         struct rpc_inode *rpci;
         int err;
@@ -554,10 +600,8 @@ static int __rpc_mkpipe(struct inode *dir, struct dentry *dentry,
         if (err)
                 return err;
         rpci = RPC_I(dentry->d_inode);
-       rpci->nkern_readwriters = 1;
         rpci->private = private;
-       rpci->flags = flags;
-       rpci->ops = ops;
+       rpci->pipe = pipe;
         fsnotify_create(dir, dentry);
         return 0;
  }
@@ -573,6 +617,22 @@ static int __rpc_rmdir(struct inode *dir, struct dentry *dentry)
         return ret;
  }
  
+int rpc_rmdir(struct dentry *dentry)
+{
+       struct dentry *parent;
+       struct inode *dir;
+       int error;
+
+       parent = dget_parent(dentry);
+       dir = parent->d_inode;
+       mutex_lock_nested(&dir->i_mutex, I_MUTEX_PARENT);
+       error = __rpc_rmdir(dir, dentry);
+       mutex_unlock(&dir->i_mutex);
+       dput(parent);
+       return error;
+}
+EXPORT_SYMBOL_GPL(rpc_rmdir);
+
  static int __rpc_unlink(struct inode *dir, struct dentry *dentry)
  {
         int ret;
@@ -587,16 +647,12 @@ static int __rpc_unlink(struct inode *dir, struct dentry *dentry)
  static int __rpc_rmpipe(struct inode *dir, struct dentry *dentry)
  {
         struct inode *inode = dentry->d_inode;
-       struct rpc_inode *rpci = RPC_I(inode);
  
-       rpci->nkern_readwriters--;
-       if (rpci->nkern_readwriters != 0)
-               return 0;
         rpc_close_pipes(inode);
         return __rpc_unlink(dir, dentry);
  }
  
-static struct dentry *__rpc_lookup_create(struct dentry *parent,
+static struct dentry *__rpc_lookup_create_exclusive(struct dentry *parent,
                                           struct qstr *name)
  {
         struct dentry *dentry;
@@ -604,27 +660,13 @@ static struct dentry *__rpc_lookup_create(struct dentry *parent,
         dentry = d_lookup(parent, name);
         if (!dentry) {
                 dentry = d_alloc(parent, name);
-               if (!dentry) {
-                       dentry = ERR_PTR(-ENOMEM);
-                       goto out_err;
-               }
+               if (!dentry)
+                       return ERR_PTR(-ENOMEM);
         }
-       if (!dentry->d_inode)
+       if (dentry->d_inode == NULL) {
                 d_set_d_op(dentry, &rpc_dentry_operations);
-out_err:
-       return dentry;
-}
-
-static struct dentry *__rpc_lookup_create_exclusive(struct dentry *parent,
-                                         struct qstr *name)
-{
-       struct dentry *dentry;
-
-       dentry = __rpc_lookup_create(parent, name);
-       if (IS_ERR(dentry))
-               return dentry;
-       if (dentry->d_inode == NULL)
                 return dentry;
+       }
         dput(dentry);
         return ERR_PTR(-EEXIST);
  }
@@ -779,7 +821,7 @@ static int rpc_rmdir_depopulate(struct dentry *dentry,
   * @private: private data to associate with the pipe, for the caller's use
   * @ops: operations defining the behavior of the pipe: upcall, downcall,
   *     release_pipe, open_pipe, and destroy_msg.
- * @flags: rpc_inode flags
+ * @flags: rpc_pipe flags
   *
   * Data is made available for userspace to read by calls to
   * rpc_queue_upcall().  The actual reads will result in calls to
@@ -792,9 +834,8 @@ static int rpc_rmdir_depopulate(struct dentry *dentry,
   * The @private argument passed here will be available to all these methods
   * from the file pointer, via RPC_I(file->f_dentry->d_inode)->private.
   */
-struct dentry *rpc_mkpipe(struct dentry *parent, const char *name,
-                         void *private, const struct rpc_pipe_ops *ops,
-                         int flags)
+struct dentry *rpc_mkpipe_dentry(struct dentry *parent, const char *name,
+                                void *private, struct rpc_pipe *pipe)
  {
         struct dentry *dentry;
         struct inode *dir = parent->d_inode;
@@ -802,9 +843,9 @@ struct dentry *rpc_mkpipe(struct dentry *parent, const char *name,
         struct qstr q;
         int err;
  
-       if (ops->upcall == NULL)
+       if (pipe->ops->upcall == NULL)
                 umode &= ~S_IRUGO;
-       if (ops->downcall == NULL)
+       if (pipe->ops->downcall == NULL)
                 umode &= ~S_IWUGO;
  
         q.name = name;
@@ -812,24 +853,11 @@ struct dentry *rpc_mkpipe(struct dentry *parent, const char *name,
         q.hash = full_name_hash(q.name, q.len),
  
         mutex_lock_nested(&dir->i_mutex, I_MUTEX_PARENT);
-       dentry = __rpc_lookup_create(parent, &q);
+       dentry = __rpc_lookup_create_exclusive(parent, &q);
         if (IS_ERR(dentry))
                 goto out;
-       if (dentry->d_inode) {
-               struct rpc_inode *rpci = RPC_I(dentry->d_inode);
-               if (rpci->private != private ||
-                               rpci->ops != ops ||
-                               rpci->flags != flags) {
-                       dput (dentry);
-                       err = -EBUSY;
-                       goto out_err;
-               }
-               rpci->nkern_readwriters++;
-               goto out;
-       }
-
-       err = __rpc_mkpipe(dir, dentry, umode, &rpc_pipe_fops,
-                          private, ops, flags);
+       err = __rpc_mkpipe_dentry(dir, dentry, umode, &rpc_pipe_fops,
+                                 private, pipe);
         if (err)
                 goto out_err;
  out:
@@ -842,7 +870,7 @@ out_err:
                         err);
         goto out;
  }
-EXPORT_SYMBOL_GPL(rpc_mkpipe);
+EXPORT_SYMBOL_GPL(rpc_mkpipe_dentry);
  
  /**
   * rpc_unlink - remove a pipe
@@ -915,7 +943,7 @@ struct dentry *rpc_create_client_dir(struct dentry *dentry,
  
  /**
   * rpc_remove_client_dir - Remove a directory created with rpc_create_client_dir()
- * @dentry: directory to remove
+ * @clnt: rpc client
   */
  int rpc_remove_client_dir(struct dentry *dentry)
  {
@@ -1020,11 +1048,64 @@ static const struct rpc_filelist files[] = {
         },
  };
  
+/*
+ * This call can be used only in RPC pipefs mount notification hooks.
+ */
+struct dentry *rpc_d_lookup_sb(const struct super_block *sb,
+                              const unsigned char *dir_name)
+{
+       struct qstr dir = {
+               .name = dir_name,
+               .len = strlen(dir_name),
+               .hash = full_name_hash(dir_name, strlen(dir_name)),
+       };
+
+       return d_lookup(sb->s_root, &dir);
+}
+EXPORT_SYMBOL_GPL(rpc_d_lookup_sb);
+
+void rpc_pipefs_init_net(struct net *net)
+{
+       struct sunrpc_net *sn = net_generic(net, sunrpc_net_id);
+
+       mutex_init(&sn->pipefs_sb_lock);
+}
+
+/*
+ * This call is used for per-network-namespace operations.
+ * Note: the function returns with pipefs_sb_lock held if the superblock was
+ * found. The lock must be released by calling rpc_put_sb_net() once all
+ * operations have completed.
+ */
+struct super_block *rpc_get_sb_net(const struct net *net)
+{
+       struct sunrpc_net *sn = net_generic(net, sunrpc_net_id);
+
+       mutex_lock(&sn->pipefs_sb_lock);
+       if (sn->pipefs_sb)
+               return sn->pipefs_sb;
+       mutex_unlock(&sn->pipefs_sb_lock);
+       return NULL;
+}
+EXPORT_SYMBOL_GPL(rpc_get_sb_net);
+
+void rpc_put_sb_net(const struct net *net)
+{
+       struct sunrpc_net *sn = net_generic(net, sunrpc_net_id);
+
+       BUG_ON(sn->pipefs_sb == NULL);
+       mutex_unlock(&sn->pipefs_sb_lock);
+}
+EXPORT_SYMBOL_GPL(rpc_put_sb_net);
+
  static int
  rpc_fill_super(struct super_block *sb, void *data, int silent)
  {
         struct inode *inode;
         struct dentry *root;
+       struct net *net = data;
+       struct sunrpc_net *sn = net_generic(net, sunrpc_net_id);
+       int err;
  
         sb->s_blocksize = PAGE_CACHE_SIZE;
         sb->s_blocksize_bits = PAGE_CACHE_SHIFT;
@@ -1038,21 +1119,54 @@ rpc_fill_super(struct super_block *sb, void *data, int silent)
                 return -ENOMEM;
         if (rpc_populate(root, files, RPCAUTH_lockd, RPCAUTH_RootEOF, NULL))
                 return -ENOMEM;
+       dprintk("RPC:   sending pipefs MOUNT notification for net %p%s\n", net,
+                                                               NET_NAME(net));
+       err = blocking_notifier_call_chain(&rpc_pipefs_notifier_list,
+                                          RPC_PIPEFS_MOUNT,
+                                          sb);
+       if (err)
+               goto err_depopulate;
+       sb->s_fs_info = get_net(net);
+       sn->pipefs_sb = sb;
         return 0;
+
+err_depopulate:
+       blocking_notifier_call_chain(&rpc_pipefs_notifier_list,
+                                          RPC_PIPEFS_UMOUNT,
+                                          sb);
+       __rpc_depopulate(root, files, RPCAUTH_lockd, RPCAUTH_RootEOF);
+       return err;
  }
  
  static struct dentry *
  rpc_mount(struct file_system_type *fs_type,
                 int flags, const char *dev_name, void *data)
  {
-       return mount_single(fs_type, flags, data, rpc_fill_super);
+       return mount_ns(fs_type, flags, current->nsproxy->net_ns, rpc_fill_super);
+}
+
+static void rpc_kill_sb(struct super_block *sb)
+{
+       struct net *net = sb->s_fs_info;
+       struct sunrpc_net *sn = net_generic(net, sunrpc_net_id);
+
+       mutex_lock(&sn->pipefs_sb_lock);
+       sn->pipefs_sb = NULL;
+       mutex_unlock(&sn->pipefs_sb_lock);
+       put_net(net);
+       dprintk("RPC:   sending pipefs UMOUNT notification for net %p%s\n", net,
+                                                               NET_NAME(net));
+       blocking_notifier_call_chain(&rpc_pipefs_notifier_list,
+                                          RPC_PIPEFS_UMOUNT,
+                                          sb);
+       kill_litter_super(sb);
  }
  
  static struct file_system_type rpc_pipe_fs_type = {
         .owner          = THIS_MODULE,
         .name           = "rpc_pipefs",
         .mount          = rpc_mount,
-       .kill_sb        = kill_litter_super,
+       .kill_sb        = rpc_kill_sb,
  };
  
  static void
@@ -1062,16 +1176,8 @@ init_once(void *foo)
  
         inode_init_once(&rpci->vfs_inode);
         rpci->private = NULL;
-       rpci->nreaders = 0;
-       rpci->nwriters = 0;
-       INIT_LIST_HEAD(&rpci->in_upcall);
-       INIT_LIST_HEAD(&rpci->in_downcall);
-       INIT_LIST_HEAD(&rpci->pipe);
-       rpci->pipelen = 0;
+       rpci->pipe = NULL;
         init_waitqueue_head(&rpci->waitq);
-       INIT_DELAYED_WORK(&rpci->queue_timeout,
-                           rpc_timeout_upcall_queue);
-       rpci->ops = NULL;
  }
  
  int register_rpc_pipefs(void)
@@ -1085,17 +1191,24 @@ int register_rpc_pipefs(void)
                                 init_once);
         if (!rpc_inode_cachep)
                 return -ENOMEM;
+       err = rpc_clients_notifier_register();
+       if (err)
+               goto err_notifier;
         err = register_filesystem(&rpc_pipe_fs_type);
-       if (err) {
-               kmem_cache_destroy(rpc_inode_cachep);
-               return err;
-       }
-
+       if (err)
+               goto err_register;
         return 0;
+
+err_register:
+       rpc_clients_notifier_unregister();
+err_notifier:
+       kmem_cache_destroy(rpc_inode_cachep);
+       return err;
  }
  
  void unregister_rpc_pipefs(void)
  {
+       rpc_clients_notifier_unregister();
         kmem_cache_destroy(rpc_inode_cachep);
         unregister_filesystem(&rpc_pipe_fs_type);
  }
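
The rpc_get_sb_net()/rpc_put_sb_net() pair added above has an asymmetric contract: the getter returns with the lock held only when a superblock exists, and drops the lock itself otherwise. A minimal userspace sketch of that contract follows, assuming pthread mutexes in place of kernel mutexes. The names ns_state, ns_init, ns_get_sb, ns_put_sb and ns_set_sb are hypothetical and do not appear in the patch; ns_set_sb takes the lock for both publishing and clearing, which is a simplification, since the diff only shows the lock being taken when rpc_kill_sb() clears the pointer.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for the per-namespace state (struct sunrpc_net). */
struct ns_state {
	pthread_mutex_t sb_lock;	/* plays the role of pipefs_sb_lock */
	void *sb;			/* plays the role of pipefs_sb; NULL when unmounted */
};

static void ns_init(struct ns_state *ns)
{
	pthread_mutex_init(&ns->sb_lock, NULL);
	ns->sb = NULL;
}

/* Returns the superblock with sb_lock held, or NULL with the lock released. */
static void *ns_get_sb(struct ns_state *ns)
{
	pthread_mutex_lock(&ns->sb_lock);
	if (ns->sb)
		return ns->sb;
	pthread_mutex_unlock(&ns->sb_lock);
	return NULL;
}

/* Must be called only after ns_get_sb() returned non-NULL. */
static void ns_put_sb(struct ns_state *ns)
{
	assert(ns->sb != NULL);	/* mirrors the BUG_ON() in rpc_put_sb_net() */
	pthread_mutex_unlock(&ns->sb_lock);
}

/* Sketch-only helper: publish or clear the pointer under the lock. */
static void ns_set_sb(struct ns_state *ns, void *sb)
{
	pthread_mutex_lock(&ns->sb_lock);
	ns->sb = sb;
	pthread_mutex_unlock(&ns->sb_lock);
}
```

A caller would check the return value of ns_get_sb() and call ns_put_sb() only on the non-NULL path. Calling it after a NULL return would unlock a mutex the caller does not hold.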
diff --git a/net/sunrpc/rpcb_clnt.c b/net/sunrpc/rpcb_clnt.c
index 8761bf8e36fc3cb8b41348d879093461c6ce98db..207a74696c9f84a62df704c94190ed28474c23b0 100644
--- a/net/sunrpc/rpcb_clnt.c
+++ b/net/sunrpc/rpcb_clnt.c
@@ -23,12 +23,15 @@
  #include <linux/errno.h>
  #include <linux/mutex.h>
  #include <linux/slab.h>
+#include <linux/nsproxy.h>
  #include <net/ipv6.h>
  
  #include <linux/sunrpc/clnt.h>
  #include <linux/sunrpc/sched.h>
  #include <linux/sunrpc/xprtsock.h>
  
+#include "netns.h"
+
  #ifdef RPC_DEBUG
  # define RPCDBG_FACILITY       RPCDBG_BIND
  #endif
@@ -109,13 +112,7 @@ enum {
  
  static void                    rpcb_getport_done(struct rpc_task *, void *);
  static void                    rpcb_map_release(void *data);
-static struct rpc_program      rpcb_program;
-
-static struct rpc_clnt *       rpcb_local_clnt;
-static struct rpc_clnt *       rpcb_local_clnt4;
-
-DEFINE_SPINLOCK(rpcb_clnt_lock);
-unsigned int                   rpcb_users;
+static const struct rpc_program        rpcb_program;
  
  struct rpcbind_args {
         struct rpc_xprt *       r_xprt;
@@ -140,8 +137,8 @@ struct rpcb_info {
         struct rpc_procinfo *   rpc_proc;
  };
  
-static struct rpcb_info rpcb_next_version[];
-static struct rpcb_info rpcb_next_version6[];
+static const struct rpcb_info rpcb_next_version[];
+static const struct rpcb_info rpcb_next_version6[];
  
  static const struct rpc_call_ops rpcb_getport_ops = {
         .rpc_call_done          = rpcb_getport_done,
@@ -164,32 +161,34 @@ static void rpcb_map_release(void *data)
         kfree(map);
  }
  
-static int rpcb_get_local(void)
+static int rpcb_get_local(struct net *net)
  {
         int cnt;
+       struct sunrpc_net *sn = net_generic(net, sunrpc_net_id);
  
-       spin_lock(&rpcb_clnt_lock);
-       if (rpcb_users)
-               rpcb_users++;
-       cnt = rpcb_users;
-       spin_unlock(&rpcb_clnt_lock);
+       spin_lock(&sn->rpcb_clnt_lock);
+       if (sn->rpcb_users)
+               sn->rpcb_users++;
+       cnt = sn->rpcb_users;
+       spin_unlock(&sn->rpcb_clnt_lock);
  
         return cnt;
  }
  
-void rpcb_put_local(void)
+void rpcb_put_local(struct net *net)
  {
-       struct rpc_clnt *clnt = rpcb_local_clnt;
-       struct rpc_clnt *clnt4 = rpcb_local_clnt4;
+       struct sunrpc_net *sn = net_generic(net, sunrpc_net_id);
+       struct rpc_clnt *clnt = sn->rpcb_local_clnt;
+       struct rpc_clnt *clnt4 = sn->rpcb_local_clnt4;
         int shutdown;
  
-       spin_lock(&rpcb_clnt_lock);
-       if (--rpcb_users == 0) {
-               rpcb_local_clnt = NULL;
-               rpcb_local_clnt4 = NULL;
+       spin_lock(&sn->rpcb_clnt_lock);
+       if (--sn->rpcb_users == 0) {
+               sn->rpcb_local_clnt = NULL;
+               sn->rpcb_local_clnt4 = NULL;
         }
-       shutdown = !rpcb_users;
-       spin_unlock(&rpcb_clnt_lock);
+       shutdown = !sn->rpcb_users;
+       spin_unlock(&sn->rpcb_clnt_lock);
  
         if (shutdown) {
                 /*
@@ -202,30 +201,34 @@ void rpcb_put_local(void)
         }
  }
  
-static void rpcb_set_local(struct rpc_clnt *clnt, struct rpc_clnt *clnt4)
+static void rpcb_set_local(struct net *net, struct rpc_clnt *clnt,
+                       struct rpc_clnt *clnt4)
  {
+       struct sunrpc_net *sn = net_generic(net, sunrpc_net_id);
+
         /* Protected by rpcb_create_local_mutex */
-       rpcb_local_clnt = clnt;
-       rpcb_local_clnt4 = clnt4;
+       sn->rpcb_local_clnt = clnt;
+       sn->rpcb_local_clnt4 = clnt4;
         smp_wmb(); 
-       rpcb_users = 1;
+       sn->rpcb_users = 1;
         dprintk("RPC:       created new rpcb local clients (rpcb_local_clnt: "
-                       "%p, rpcb_local_clnt4: %p)\n", rpcb_local_clnt,
-                       rpcb_local_clnt4);
+                       "%p, rpcb_local_clnt4: %p) for net %p%s\n",
+                       sn->rpcb_local_clnt, sn->rpcb_local_clnt4,
+                       net, (net == &init_net) ? " (init_net)" : "");
  }
  
  /*
   * Returns zero on success, otherwise a negative errno value
   * is returned.
   */
-static int rpcb_create_local_unix(void)
+static int rpcb_create_local_unix(struct net *net)
  {
         static const struct sockaddr_un rpcb_localaddr_rpcbind = {
                 .sun_family             = AF_LOCAL,
                 .sun_path               = RPCBIND_SOCK_PATHNAME,
         };
         struct rpc_create_args args = {
-               .net            = &init_net,
+               .net            = net,
                 .protocol       = XPRT_TRANSPORT_LOCAL,
                 .address        = (struct sockaddr *)&rpcb_localaddr_rpcbind,
                 .addrsize       = sizeof(rpcb_localaddr_rpcbind),
@@ -258,7 +261,7 @@ static int rpcb_create_local_unix(void)
                 clnt4 = NULL;
         }
  
-       rpcb_set_local(clnt, clnt4);
+       rpcb_set_local(net, clnt, clnt4);
  
  out:
         return result;
@@ -268,7 +271,7 @@ out:
   * Returns zero on success, otherwise a negative errno value
   * is returned.
   */
-static int rpcb_create_local_net(void)
+static int rpcb_create_local_net(struct net *net)
  {
         static const struct sockaddr_in rpcb_inaddr_loopback = {
                 .sin_family             = AF_INET,
@@ -276,7 +279,7 @@ static int rpcb_create_local_net(void)
                 .sin_port               = htons(RPCBIND_PORT),
         };
         struct rpc_create_args args = {
-               .net            = &init_net,
+               .net            = net,
                 .protocol       = XPRT_TRANSPORT_TCP,
                 .address        = (struct sockaddr *)&rpcb_inaddr_loopback,
                 .addrsize       = sizeof(rpcb_inaddr_loopback),
@@ -310,7 +313,7 @@ static int rpcb_create_local_net(void)
                 clnt4 = NULL;
         }
  
-       rpcb_set_local(clnt, clnt4);
+       rpcb_set_local(net, clnt, clnt4);
  
  out:
         return result;
@@ -320,31 +323,32 @@ out:
   * Returns zero on success, otherwise a negative errno value
   * is returned.
   */
-int rpcb_create_local(void)
+int rpcb_create_local(struct net *net)
  {
         static DEFINE_MUTEX(rpcb_create_local_mutex);
         int result = 0;
  
-       if (rpcb_get_local())
+       if (rpcb_get_local(net))
                 return result;
  
         mutex_lock(&rpcb_create_local_mutex);
-       if (rpcb_get_local())
+       if (rpcb_get_local(net))
                 goto out;
  
-       if (rpcb_create_local_unix() != 0)
-               result = rpcb_create_local_net();
+       if (rpcb_create_local_unix(net) != 0)
+               result = rpcb_create_local_net(net);
  
  out:
         mutex_unlock(&rpcb_create_local_mutex);
         return result;
  }
  
-static struct rpc_clnt *rpcb_create(char *hostname, struct sockaddr *srvaddr,
-                                   size_t salen, int proto, u32 version)
+static struct rpc_clnt *rpcb_create(struct net *net, const char *hostname,
+                                   struct sockaddr *srvaddr, size_t salen,
+                                   int proto, u32 version)
  {
         struct rpc_create_args args = {
-               .net            = &init_net,
+               .net            = net,
                 .protocol       = proto,
                 .address        = srvaddr,
                 .addrsize       = salen,
@@ -420,7 +424,7 @@ static int rpcb_register_call(struct rpc_clnt *clnt, struct rpc_message *msg)
   * IN6ADDR_ANY (ie available for all AF_INET and AF_INET6
   * addresses).
   */
-int rpcb_register(u32 prog, u32 vers, int prot, unsigned short port)
+int rpcb_register(struct net *net, u32 prog, u32 vers, int prot, unsigned short port)
  {
         struct rpcbind_args map = {
                 .r_prog         = prog,
@@ -431,6 +435,7 @@ int rpcb_register(u32 prog, u32 vers, int prot, unsigned short port)
         struct rpc_message msg = {
                 .rpc_argp       = &map,
         };
+       struct sunrpc_net *sn = net_generic(net, sunrpc_net_id);
  
         dprintk("RPC:       %sregistering (%u, %u, %d, %u) with local "
                         "rpcbind\n", (port ? "" : "un"),
@@ -440,13 +445,14 @@ int rpcb_register(u32 prog, u32 vers, int prot, unsigned short port)
         if (port)
                 msg.rpc_proc = &rpcb_procedures2[RPCBPROC_SET];
  
-       return rpcb_register_call(rpcb_local_clnt, &msg);
+       return rpcb_register_call(sn->rpcb_local_clnt, &msg);
  }
  
  /*
   * Fill in AF_INET family-specific arguments to register
   */
-static int rpcb_register_inet4(const struct sockaddr *sap,
+static int rpcb_register_inet4(struct sunrpc_net *sn,
+                              const struct sockaddr *sap,
                                struct rpc_message *msg)
  {
         const struct sockaddr_in *sin = (const struct sockaddr_in *)sap;
@@ -465,7 +471,7 @@ static int rpcb_register_inet4(const struct sockaddr *sap,
         if (port)
                 msg->rpc_proc = &rpcb_procedures4[RPCBPROC_SET];
  
-       result = rpcb_register_call(rpcb_local_clnt4, msg);
+       result = rpcb_register_call(sn->rpcb_local_clnt4, msg);
         kfree(map->r_addr);
         return result;
  }
@@ -473,7 +479,8 @@ static int rpcb_register_inet4(const struct sockaddr *sap,
  /*
   * Fill in AF_INET6 family-specific arguments to register
   */
-static int rpcb_register_inet6(const struct sockaddr *sap,
+static int rpcb_register_inet6(struct sunrpc_net *sn,
+                              const struct sockaddr *sap,
                                struct rpc_message *msg)
  {
         const struct sockaddr_in6 *sin6 = (const struct sockaddr_in6 *)sap;
@@ -492,12 +499,13 @@ static int rpcb_register_inet6(const struct sockaddr *sap,
         if (port)
                 msg->rpc_proc = &rpcb_procedures4[RPCBPROC_SET];
  
-       result = rpcb_register_call(rpcb_local_clnt4, msg);
+       result = rpcb_register_call(sn->rpcb_local_clnt4, msg);
         kfree(map->r_addr);
         return result;
  }
  
-static int rpcb_unregister_all_protofamilies(struct rpc_message *msg)
+static int rpcb_unregister_all_protofamilies(struct sunrpc_net *sn,
+                                            struct rpc_message *msg)
  {
         struct rpcbind_args *map = msg->rpc_argp;
  
@@ -508,7 +516,7 @@ static int rpcb_unregister_all_protofamilies(struct rpc_message *msg)
         map->r_addr = "";
         msg->rpc_proc = &rpcb_procedures4[RPCBPROC_UNSET];
  
-       return rpcb_register_call(rpcb_local_clnt4, msg);
+       return rpcb_register_call(sn->rpcb_local_clnt4, msg);
  }
  
  /**
@@ -554,7 +562,7 @@ static int rpcb_unregister_all_protofamilies(struct rpc_message *msg)
   * service on any IPv4 address, but not on IPv6.  The latter
   * advertises the service on all IPv4 and IPv6 addresses.
   */
-int rpcb_v4_register(const u32 program, const u32 version,
+int rpcb_v4_register(struct net *net, const u32 program, const u32 version,
                      const struct sockaddr *address, const char *netid)
  {
         struct rpcbind_args map = {
@@ -566,18 +574,19 @@ int rpcb_v4_register(const u32 program, const u32 version,
         struct rpc_message msg = {
                 .rpc_argp       = &map,
         };
+       struct sunrpc_net *sn = net_generic(net, sunrpc_net_id);
  
-       if (rpcb_local_clnt4 == NULL)
+       if (sn->rpcb_local_clnt4 == NULL)
                 return -EPROTONOSUPPORT;
  
         if (address == NULL)
-               return rpcb_unregister_all_protofamilies(&msg);
+               return rpcb_unregister_all_protofamilies(sn, &msg);
  
         switch (address->sa_family) {
         case AF_INET:
-               return rpcb_register_inet4(address, &msg);
+               return rpcb_register_inet4(sn, address, &msg);
         case AF_INET6:
-               return rpcb_register_inet6(address, &msg);
+               return rpcb_register_inet6(sn, address, &msg);
         }
  
         return -EAFNOSUPPORT;
@@ -611,9 +620,10 @@ static struct rpc_task *rpcb_call_async(struct rpc_clnt *rpcb_clnt, struct rpcbi
  static struct rpc_clnt *rpcb_find_transport_owner(struct rpc_clnt *clnt)
  {
         struct rpc_clnt *parent = clnt->cl_parent;
+       struct rpc_xprt *xprt = rcu_dereference(clnt->cl_xprt);
  
         while (parent != clnt) {
-               if (parent->cl_xprt != clnt->cl_xprt)
+               if (rcu_dereference(parent->cl_xprt) != xprt)
                         break;
                 if (clnt->cl_autobind)
                         break;
@@ -644,12 +654,16 @@ void rpcb_getport_async(struct rpc_task *task)
         size_t salen;
         int status;
  
-       clnt = rpcb_find_transport_owner(task->tk_client);
-       xprt = clnt->cl_xprt;
+       rcu_read_lock();
+       do {
+               clnt = rpcb_find_transport_owner(task->tk_client);
+               xprt = xprt_get(rcu_dereference(clnt->cl_xprt));
+       } while (xprt == NULL);
+       rcu_read_unlock();
  
         dprintk("RPC: %5u %s(%s, %u, %u, %d)\n",
                 task->tk_pid, __func__,
-               clnt->cl_server, clnt->cl_prog, clnt->cl_vers, xprt->prot);
+               xprt->servername, clnt->cl_prog, clnt->cl_vers, xprt->prot);
  
         /* Put self on the wait queue to ensure we get notified if
          * some other task is already attempting to bind the port */
@@ -658,6 +672,7 @@ void rpcb_getport_async(struct rpc_task *task)
         if (xprt_test_and_set_binding(xprt)) {
                 dprintk("RPC: %5u %s: waiting for another binder\n",
                         task->tk_pid, __func__);
+               xprt_put(xprt);
                 return;
         }
  
@@ -699,8 +714,8 @@ void rpcb_getport_async(struct rpc_task *task)
         dprintk("RPC: %5u %s: trying rpcbind version %u\n",
                 task->tk_pid, __func__, bind_version);
  
-       rpcb_clnt = rpcb_create(clnt->cl_server, sap, salen, xprt->prot,
-                               bind_version);
+       rpcb_clnt = rpcb_create(xprt->xprt_net, xprt->servername, sap, salen,
+                               xprt->prot, bind_version);
         if (IS_ERR(rpcb_clnt)) {
                 status = PTR_ERR(rpcb_clnt);
                 dprintk("RPC: %5u %s: rpcb_create failed, error %ld\n",
@@ -725,7 +740,7 @@ void rpcb_getport_async(struct rpc_task *task)
         switch (bind_version) {
         case RPCBVERS_4:
         case RPCBVERS_3:
-               map->r_netid = rpc_peeraddr2str(clnt, RPC_DISPLAY_NETID);
+               map->r_netid = xprt->address_strings[RPC_DISPLAY_NETID];
                 map->r_addr = rpc_sockaddr2uaddr(sap, GFP_ATOMIC);
                 map->r_owner = "";
                 break;
@@ -754,6 +769,7 @@ bailout_release_client:
  bailout_nofree:
         rpcb_wake_rpcbind_waiters(xprt, status);
         task->tk_status = status;
+       xprt_put(xprt);
  }
  EXPORT_SYMBOL_GPL(rpcb_getport_async);
  
@@ -801,11 +817,11 @@ static void rpcb_getport_done(struct rpc_task *child, void *data)
  static void rpcb_enc_mapping(struct rpc_rqst *req, struct xdr_stream *xdr,
                              const struct rpcbind_args *rpcb)
  {
-       struct rpc_task *task = req->rq_task;
         __be32 *p;
  
         dprintk("RPC: %5u encoding PMAP_%s call (%u, %u, %d, %u)\n",
-                       task->tk_pid, task->tk_msg.rpc_proc->p_name,
+                       req->rq_task->tk_pid,
+                       req->rq_task->tk_msg.rpc_proc->p_name,
                         rpcb->r_prog, rpcb->r_vers, rpcb->r_prot, rpcb->r_port);
  
         p = xdr_reserve_space(xdr, RPCB_mappingargs_sz << 2);
@@ -818,7 +834,6 @@ static void rpcb_enc_mapping(struct rpc_rqst *req, struct xdr_stream *xdr,
  static int rpcb_dec_getport(struct rpc_rqst *req, struct xdr_stream *xdr,
                             struct rpcbind_args *rpcb)
  {
-       struct rpc_task *task = req->rq_task;
         unsigned long port;
         __be32 *p;
  
@@ -829,8 +844,8 @@ static int rpcb_dec_getport(struct rpc_rqst *req, struct xdr_stream *xdr,
                 return -EIO;
  
         port = be32_to_cpup(p);
-       dprintk("RPC: %5u PMAP_%s result: %lu\n", task->tk_pid,
-                       task->tk_msg.rpc_proc->p_name, port);
+       dprintk("RPC: %5u PMAP_%s result: %lu\n", req->rq_task->tk_pid,
+                       req->rq_task->tk_msg.rpc_proc->p_name, port);
         if (unlikely(port > USHRT_MAX))
                 return -EIO;
  
@@ -841,7 +856,6 @@ static int rpcb_dec_getport(struct rpc_rqst *req, struct xdr_stream *xdr,
  static int rpcb_dec_set(struct rpc_rqst *req, struct xdr_stream *xdr,
                         unsigned int *boolp)
  {
-       struct rpc_task *task = req->rq_task;
         __be32 *p;
  
         p = xdr_inline_decode(xdr, 4);
@@ -853,7 +867,8 @@ static int rpcb_dec_set(struct rpc_rqst *req, struct xdr_stream *xdr,
                 *boolp = 1;
  
         dprintk("RPC: %5u RPCB_%s call %s\n",
-                       task->tk_pid, task->tk_msg.rpc_proc->p_name,
+                       req->rq_task->tk_pid,
+                       req->rq_task->tk_msg.rpc_proc->p_name,
                         (*boolp ? "succeeded" : "failed"));
         return 0;
  }
@@ -873,11 +888,11 @@ static void encode_rpcb_string(struct xdr_stream *xdr, const char *string,
  static void rpcb_enc_getaddr(struct rpc_rqst *req, struct xdr_stream *xdr,
                              const struct rpcbind_args *rpcb)
  {
-       struct rpc_task *task = req->rq_task;
         __be32 *p;
  
         dprintk("RPC: %5u encoding RPCB_%s call (%u, %u, '%s', '%s')\n",
-                       task->tk_pid, task->tk_msg.rpc_proc->p_name,
+                       req->rq_task->tk_pid,
+                       req->rq_task->tk_msg.rpc_proc->p_name,
                         rpcb->r_prog, rpcb->r_vers,
                         rpcb->r_netid, rpcb->r_addr);
  
@@ -895,7 +910,6 @@ static int rpcb_dec_getaddr(struct rpc_rqst *req, struct xdr_stream *xdr,
  {
         struct sockaddr_storage address;
         struct sockaddr *sap = (struct sockaddr *)&address;
-       struct rpc_task *task = req->rq_task;
         __be32 *p;
         u32 len;
  
@@ -912,7 +926,7 @@ static int rpcb_dec_getaddr(struct rpc_rqst *req, struct xdr_stream *xdr,
          */
         if (len == 0) {
                 dprintk("RPC: %5u RPCB reply: program not registered\n",
-                               task->tk_pid);
+                               req->rq_task->tk_pid);
                 return 0;
         }
  
@@ -922,10 +936,11 @@ static int rpcb_dec_getaddr(struct rpc_rqst *req, struct xdr_stream *xdr,
         p = xdr_inline_decode(xdr, len);
         if (unlikely(p == NULL))
                 goto out_fail;
-       dprintk("RPC: %5u RPCB_%s reply: %s\n", task->tk_pid,
-                       task->tk_msg.rpc_proc->p_name, (char *)p);
+       dprintk("RPC: %5u RPCB_%s reply: %s\n", req->rq_task->tk_pid,
+                       req->rq_task->tk_msg.rpc_proc->p_name, (char *)p);
  
-       if (rpc_uaddr2sockaddr((char *)p, len, sap, sizeof(address)) == 0)
+       if (rpc_uaddr2sockaddr(req->rq_xprt->xprt_net, (char *)p, len,
+                               sap, sizeof(address)) == 0)
                 goto out_fail;
         rpcb->r_port = rpc_get_port(sap);
  
@@ -933,7 +948,8 @@ static int rpcb_dec_getaddr(struct rpc_rqst *req, struct xdr_stream *xdr,
  
  out_fail:
         dprintk("RPC: %5u malformed RPCB_%s reply\n",
-                       task->tk_pid, task->tk_msg.rpc_proc->p_name);
+                       req->rq_task->tk_pid,
+                       req->rq_task->tk_msg.rpc_proc->p_name);
         return -EIO;
  }
  
@@ -1041,7 +1057,7 @@ static struct rpc_procinfo rpcb_procedures4[] = {
         },
  };
  
-static struct rpcb_info rpcb_next_version[] = {
+static const struct rpcb_info rpcb_next_version[] = {
         {
                 .rpc_vers       = RPCBVERS_2,
                 .rpc_proc       = &rpcb_procedures2[RPCBPROC_GETPORT],
@@ -1051,7 +1067,7 @@ static struct rpcb_info rpcb_next_version[] = {
         },
  };
  
-static struct rpcb_info rpcb_next_version6[] = {
+static const struct rpcb_info rpcb_next_version6[] = {
         {
                 .rpc_vers       = RPCBVERS_4,
                 .rpc_proc       = &rpcb_procedures4[RPCBPROC_GETADDR],
@@ -1065,25 +1081,25 @@ static struct rpcb_info rpcb_next_version6[] = {
         },
  };
  
-static struct rpc_version rpcb_version2 = {
+static const struct rpc_version rpcb_version2 = {
         .number         = RPCBVERS_2,
         .nrprocs        = ARRAY_SIZE(rpcb_procedures2),
         .procs          = rpcb_procedures2
  };
  
-static struct rpc_version rpcb_version3 = {
+static const struct rpc_version rpcb_version3 = {
         .number         = RPCBVERS_3,
         .nrprocs        = ARRAY_SIZE(rpcb_procedures3),
         .procs          = rpcb_procedures3
  };
  
-static struct rpc_version rpcb_version4 = {
+static const struct rpc_version rpcb_version4 = {
         .number         = RPCBVERS_4,
         .nrprocs        = ARRAY_SIZE(rpcb_procedures4),
         .procs          = rpcb_procedures4
  };
  
-static struct rpc_version *rpcb_version[] = {
+static const struct rpc_version *rpcb_version[] = {
         NULL,
         NULL,
         &rpcb_version2,
@@ -1093,7 +1109,7 @@ static struct rpc_version *rpcb_version[] = {
  
  static struct rpc_stat rpcb_stats;
  
-static struct rpc_program rpcb_program = {
+static const struct rpc_program rpcb_program = {
         .name           = "rpcbind",
         .number         = RPCBIND_PROGRAM,
         .nrvers         = ARRAY_SIZE(rpcb_version),
diff --git a/net/sunrpc/sched.c b/net/sunrpc/sched.c
index 3341d89627865308be08e38a78c828f29eafa655..994cfea2bad66f814432c2fcbb1227d0a321d34f 100644
--- a/net/sunrpc/sched.c
+++ b/net/sunrpc/sched.c
@@ -28,6 +28,9 @@
  #define RPCDBG_FACILITY                RPCDBG_SCHED
  #endif
  
+#define CREATE_TRACE_POINTS
+#include <trace/events/sunrpc.h>
+
  /*
   * RPC slabs and memory pools
   */
@@ -205,9 +208,7 @@ static void __rpc_init_priority_wait_queue(struct rpc_wait_queue *queue, const c
         queue->qlen = 0;
         setup_timer(&queue->timer_list.timer, __rpc_queue_timer_fn, (unsigned long)queue);
         INIT_LIST_HEAD(&queue->timer_list.list);
-#ifdef RPC_DEBUG
-       queue->name = qname;
-#endif
+       rpc_assign_waitqueue_name(queue, qname);
  }
  
  void rpc_init_priority_wait_queue(struct rpc_wait_queue *queue, const char *qname)
@@ -251,6 +252,8 @@ static inline void rpc_task_set_debuginfo(struct rpc_task *task)
  
  static void rpc_set_active(struct rpc_task *task)
  {
+       trace_rpc_task_begin(task->tk_client, task, NULL);
+
         rpc_task_set_debuginfo(task);
         set_bit(RPC_TASK_ACTIVE, &task->tk_runstate);
  }
@@ -267,6 +270,8 @@ static int rpc_complete_task(struct rpc_task *task)
         unsigned long flags;
         int ret;
  
+       trace_rpc_task_complete(task->tk_client, task, NULL);
+
         spin_lock_irqsave(&wq->lock, flags);
         clear_bit(RPC_TASK_ACTIVE, &task->tk_runstate);
         ret = atomic_dec_and_test(&task->tk_count);
@@ -324,6 +329,8 @@ static void __rpc_sleep_on_priority(struct rpc_wait_queue *q,
         dprintk("RPC: %5u sleep_on(queue \"%s\" time %lu)\n",
                         task->tk_pid, rpc_qname(q), jiffies);
  
+       trace_rpc_task_sleep(task->tk_client, task, q);
+
         __rpc_add_wait_queue(q, task, queue_priority);
  
         BUG_ON(task->tk_callback != NULL);
@@ -378,6 +385,8 @@ static void __rpc_do_wake_up_task(struct rpc_wait_queue *queue, struct rpc_task
                 return;
         }
  
+       trace_rpc_task_wakeup(task->tk_client, task, queue);
+
         __rpc_remove_wait_queue(queue, task);
  
         rpc_make_runnable(task);
@@ -422,7 +431,7 @@ EXPORT_SYMBOL_GPL(rpc_wake_up_queued_task);
  /*
   * Wake up the next task on a priority queue.
   */
-static struct rpc_task * __rpc_wake_up_next_priority(struct rpc_wait_queue *queue)
+static struct rpc_task *__rpc_find_next_queued_priority(struct rpc_wait_queue *queue)
  {
         struct list_head *q;
         struct rpc_task *task;
@@ -467,30 +476,54 @@ new_queue:
  new_owner:
         rpc_set_waitqueue_owner(queue, task->tk_owner);
  out:
-       rpc_wake_up_task_queue_locked(queue, task);
         return task;
  }
  
+static struct rpc_task *__rpc_find_next_queued(struct rpc_wait_queue *queue)
+{
+       if (RPC_IS_PRIORITY(queue))
+               return __rpc_find_next_queued_priority(queue);
+       if (!list_empty(&queue->tasks[0]))
+               return list_first_entry(&queue->tasks[0], struct rpc_task, u.tk_wait.list);
+       return NULL;
+}
+
  /*
- * Wake up the next task on the wait queue.
+ * Wake up the first task on the wait queue.
   */
-struct rpc_task * rpc_wake_up_next(struct rpc_wait_queue *queue)
+struct rpc_task *rpc_wake_up_first(struct rpc_wait_queue *queue,
+               bool (*func)(struct rpc_task *, void *), void *data)
  {
         struct rpc_task *task = NULL;
  
-       dprintk("RPC:       wake_up_next(%p \"%s\")\n",
+       dprintk("RPC:       wake_up_first(%p \"%s\")\n",
                         queue, rpc_qname(queue));
         spin_lock_bh(&queue->lock);
-       if (RPC_IS_PRIORITY(queue))
-               task = __rpc_wake_up_next_priority(queue);
-       else {
-               task_for_first(task, &queue->tasks[0])
+       task = __rpc_find_next_queued(queue);
+       if (task != NULL) {
+               if (func(task, data))
                         rpc_wake_up_task_queue_locked(queue, task);
+               else
+                       task = NULL;
         }
         spin_unlock_bh(&queue->lock);
  
         return task;
  }
+EXPORT_SYMBOL_GPL(rpc_wake_up_first);
+
+static bool rpc_wake_up_next_func(struct rpc_task *task, void *data)
+{
+       return true;
+}
+
+/*
+ * Wake up the next task on the wait queue.
+*/
+struct rpc_task *rpc_wake_up_next(struct rpc_wait_queue *queue)
+{
+       return rpc_wake_up_first(queue, rpc_wake_up_next_func, NULL);
+}
  EXPORT_SYMBOL_GPL(rpc_wake_up_next);
  
  /**
@@ -501,14 +534,18 @@ EXPORT_SYMBOL_GPL(rpc_wake_up_next);
   */
  void rpc_wake_up(struct rpc_wait_queue *queue)
  {
-       struct rpc_task *task, *next;
         struct list_head *head;
  
         spin_lock_bh(&queue->lock);
         head = &queue->tasks[queue->maxpriority];
         for (;;) {
-               list_for_each_entry_safe(task, next, head, u.tk_wait.list)
+               while (!list_empty(head)) {
+                       struct rpc_task *task;
+                       task = list_first_entry(head,
+                                       struct rpc_task,
+                                       u.tk_wait.list);
                         rpc_wake_up_task_queue_locked(queue, task);
+               }
                 if (head == &queue->tasks[0])
                         break;
                 head--;
@@ -526,13 +563,16 @@ EXPORT_SYMBOL_GPL(rpc_wake_up);
   */
  void rpc_wake_up_status(struct rpc_wait_queue *queue, int status)
  {
-       struct rpc_task *task, *next;
         struct list_head *head;
  
         spin_lock_bh(&queue->lock);
         head = &queue->tasks[queue->maxpriority];
         for (;;) {
-               list_for_each_entry_safe(task, next, head, u.tk_wait.list) {
+               while (!list_empty(head)) {
+                       struct rpc_task *task;
+                       task = list_first_entry(head,
+                                       struct rpc_task,
+                                       u.tk_wait.list);
                         task->tk_status = status;
                         rpc_wake_up_task_queue_locked(queue, task);
                 }
@@ -677,6 +717,7 @@ static void __rpc_execute(struct rpc_task *task)
                         if (do_action == NULL)
                                 break;
                 }
+               trace_rpc_task_run_action(task->tk_client, task, task->tk_action);
                 do_action(task);
  
                 /*
diff --git a/net/sunrpc/stats.c b/net/sunrpc/stats.c
index 80df89d957ba02dce1b977ec19f96b51d2c657b0..bc2068ee795b95d7fdec5d57503ab3919ff3e97d 100644
--- a/net/sunrpc/stats.c
+++ b/net/sunrpc/stats.c
@@ -22,6 +22,7 @@
  #include <linux/sunrpc/clnt.h>
  #include <linux/sunrpc/svcsock.h>
  #include <linux/sunrpc/metrics.h>
+#include <linux/rcupdate.h>
  
  #include "netns.h"
  
@@ -133,20 +134,19 @@ EXPORT_SYMBOL_GPL(rpc_free_iostats);
  /**
   * rpc_count_iostats - tally up per-task stats
   * @task: completed rpc_task
+ * @stats: array of stat structures
   *
   * Relies on the caller for serialization.
   */
-void rpc_count_iostats(struct rpc_task *task)
+void rpc_count_iostats(const struct rpc_task *task, struct rpc_iostats *stats)
  {
         struct rpc_rqst *req = task->tk_rqstp;
-       struct rpc_iostats *stats;
         struct rpc_iostats *op_metrics;
         ktime_t delta;
  
-       if (!task->tk_client || !task->tk_client->cl_metrics || !req)
+       if (!stats || !req)
                 return;
  
-       stats = task->tk_client->cl_metrics;
         op_metrics = &stats[task->tk_msg.rpc_proc->p_statidx];
  
         op_metrics->om_ops++;
@@ -164,6 +164,7 @@ void rpc_count_iostats(struct rpc_task *task)
         delta = ktime_sub(ktime_get(), task->tk_start);
         op_metrics->om_execute = ktime_add(op_metrics->om_execute, delta);
  }
+EXPORT_SYMBOL_GPL(rpc_count_iostats);
  
  static void _print_name(struct seq_file *seq, unsigned int op,
                         struct rpc_procinfo *procs)
@@ -179,7 +180,7 @@ static void _print_name(struct seq_file *seq, unsigned int op,
  void rpc_print_iostats(struct seq_file *seq, struct rpc_clnt *clnt)
  {
         struct rpc_iostats *stats = clnt->cl_metrics;
-       struct rpc_xprt *xprt = clnt->cl_xprt;
+       struct rpc_xprt *xprt;
         unsigned int op, maxproc = clnt->cl_maxproc;
  
         if (!stats)
@@ -189,8 +190,11 @@ void rpc_print_iostats(struct seq_file *seq, struct rpc_clnt *clnt)
         seq_printf(seq, "p/v: %u/%u (%s)\n",
                         clnt->cl_prog, clnt->cl_vers, clnt->cl_protname);
  
+       rcu_read_lock();
+       xprt = rcu_dereference(clnt->cl_xprt);
         if (xprt)
                 xprt->ops->print_stats(xprt, seq);
+       rcu_read_unlock();
  
         seq_printf(seq, "\tper-op statistics\n");
         for (op = 0; op < maxproc; op++) {
@@ -213,45 +217,46 @@ EXPORT_SYMBOL_GPL(rpc_print_iostats);
   * Register/unregister RPC proc files
   */
  static inline struct proc_dir_entry *
-do_register(const char *name, void *data, const struct file_operations *fops)
+do_register(struct net *net, const char *name, void *data,
+           const struct file_operations *fops)
  {
         struct sunrpc_net *sn;
  
         dprintk("RPC:       registering /proc/net/rpc/%s\n", name);
-       sn = net_generic(&init_net, sunrpc_net_id);
+       sn = net_generic(net, sunrpc_net_id);
         return proc_create_data(name, 0, sn->proc_net_rpc, fops, data);
  }
  
  struct proc_dir_entry *
-rpc_proc_register(struct rpc_stat *statp)
+rpc_proc_register(struct net *net, struct rpc_stat *statp)
  {
-       return do_register(statp->program->name, statp, &rpc_proc_fops);
+       return do_register(net, statp->program->name, statp, &rpc_proc_fops);
  }
  EXPORT_SYMBOL_GPL(rpc_proc_register);
  
  void
-rpc_proc_unregister(const char *name)
+rpc_proc_unregister(struct net *net, const char *name)
  {
         struct sunrpc_net *sn;
  
-       sn = net_generic(&init_net, sunrpc_net_id);
+       sn = net_generic(net, sunrpc_net_id);
         remove_proc_entry(name, sn->proc_net_rpc);
  }
  EXPORT_SYMBOL_GPL(rpc_proc_unregister);
  
  struct proc_dir_entry *
-svc_proc_register(struct svc_stat *statp, const struct file_operations *fops)
+svc_proc_register(struct net *net, struct svc_stat *statp, const struct file_operations *fops)
  {
-       return do_register(statp->program->pg_name, statp, fops);
+       return do_register(net, statp->program->pg_name, statp, fops);
  }
  EXPORT_SYMBOL_GPL(svc_proc_register);
  
  void
-svc_proc_unregister(const char *name)
+svc_proc_unregister(struct net *net, const char *name)
  {
         struct sunrpc_net *sn;
  
-       sn = net_generic(&init_net, sunrpc_net_id);
+       sn = net_generic(net, sunrpc_net_id);
         remove_proc_entry(name, sn->proc_net_rpc);
  }
  EXPORT_SYMBOL_GPL(svc_proc_unregister);
diff --git a/net/sunrpc/sunrpc.h b/net/sunrpc/sunrpc.h
index 90c292e2738b5f7248db0f267aa63c1b88f7d8b2..14c9f6d1c5ff22987e6dfdbf2ce9a95508ee4696 100644
--- a/net/sunrpc/sunrpc.h
+++ b/net/sunrpc/sunrpc.h
@@ -47,5 +47,7 @@ int svc_send_common(struct socket *sock, struct xdr_buf *xdr,
                     struct page *headpage, unsigned long headoffset,
                     struct page *tailpage, unsigned long tailoffset);
  
+int rpc_clients_notifier_register(void);
+void rpc_clients_notifier_unregister(void);
  #endif /* _NET_SUNRPC_SUNRPC_H */
  
diff --git a/net/sunrpc/sunrpc_syms.c b/net/sunrpc/sunrpc_syms.c
index 8ec9778c3f4ad959991fe38308e999f93e5adf3d..8adfc88e793a72308f72012bd30e447cd40dd6bf 100644
--- a/net/sunrpc/sunrpc_syms.c
+++ b/net/sunrpc/sunrpc_syms.c
@@ -25,10 +25,12 @@
  #include "netns.h"
  
  int sunrpc_net_id;
+EXPORT_SYMBOL_GPL(sunrpc_net_id);
  
  static __net_init int sunrpc_init_net(struct net *net)
  {
         int err;
+       struct sunrpc_net *sn = net_generic(net, sunrpc_net_id);
  
         err = rpc_proc_init(net);
         if (err)
@@ -38,8 +40,18 @@ static __net_init int sunrpc_init_net(struct net *net)
         if (err)
                 goto err_ipmap;
  
+       err = unix_gid_cache_create(net);
+       if (err)
+               goto err_unixgid;
+
+       rpc_pipefs_init_net(net);
+       INIT_LIST_HEAD(&sn->all_clients);
+       spin_lock_init(&sn->rpc_client_lock);
+       spin_lock_init(&sn->rpcb_clnt_lock);
         return 0;
  
+err_unixgid:
+       ip_map_cache_destroy(net);
  err_ipmap:
         rpc_proc_exit(net);
  err_proc:
@@ -48,6 +60,7 @@ err_proc:
  
  static __net_exit void sunrpc_exit_net(struct net *net)
  {
+       unix_gid_cache_destroy(net);
         ip_map_cache_destroy(net);
         rpc_proc_exit(net);
  }
@@ -59,8 +72,6 @@ static struct pernet_operations sunrpc_net_ops = {
         .size = sizeof(struct sunrpc_net),
  };
  
-extern struct cache_detail unix_gid_cache;
-
  static int __init
  init_sunrpc(void)
  {
@@ -82,7 +93,6 @@ init_sunrpc(void)
  #ifdef RPC_DEBUG
         rpc_register_sysctl();
  #endif
-       cache_register(&unix_gid_cache);
         svc_init_xprt_sock();   /* svc sock transport */
         init_socket_xprt();     /* clnt sock transport */
         return 0;
@@ -105,7 +115,6 @@ cleanup_sunrpc(void)
         svc_cleanup_xprt_sock();
         unregister_rpc_pipefs();
         rpc_destroy_mempool();
-       cache_unregister(&unix_gid_cache);
         unregister_pernet_subsys(&sunrpc_net_ops);
  #ifdef RPC_DEBUG
         rpc_unregister_sysctl();
diff --git a/net/sunrpc/svc.c b/net/sunrpc/svc.c
index e4aabc02368b94e0d7b0109ab7906bfbba329b23..4153846984ac72be3a0f97b1ede45799128be863 100644
--- a/net/sunrpc/svc.c
+++ b/net/sunrpc/svc.c
@@ -20,6 +20,7 @@
  #include <linux/module.h>
  #include <linux/kthread.h>
  #include <linux/slab.h>
+#include <linux/nsproxy.h>
  
  #include <linux/sunrpc/types.h>
  #include <linux/sunrpc/xdr.h>
@@ -30,7 +31,7 @@
  
  #define RPCDBG_FACILITY        RPCDBG_SVCDSP
  
-static void svc_unregister(const struct svc_serv *serv);
+static void svc_unregister(const struct svc_serv *serv, struct net *net);
  
  #define svc_serv_is_pooled(serv)    ((serv)->sv_function)
  
@@ -368,23 +369,24 @@ svc_pool_for_cpu(struct svc_serv *serv, int cpu)
         return &serv->sv_pools[pidx % serv->sv_nrpools];
  }
  
-static int svc_rpcb_setup(struct svc_serv *serv)
+int svc_rpcb_setup(struct svc_serv *serv, struct net *net)
  {
         int err;
  
-       err = rpcb_create_local();
+       err = rpcb_create_local(net);
         if (err)
                 return err;
  
         /* Remove any stale portmap registrations */
-       svc_unregister(serv);
+       svc_unregister(serv, net);
         return 0;
  }
+EXPORT_SYMBOL_GPL(svc_rpcb_setup);
  
-void svc_rpcb_cleanup(struct svc_serv *serv)
+void svc_rpcb_cleanup(struct svc_serv *serv, struct net *net)
  {
-       svc_unregister(serv);
-       rpcb_put_local();
+       svc_unregister(serv, net);
+       rpcb_put_local(net);
  }
  EXPORT_SYMBOL_GPL(svc_rpcb_cleanup);
  
@@ -410,7 +412,7 @@ static int svc_uses_rpcbind(struct svc_serv *serv)
   */
  static struct svc_serv *
  __svc_create(struct svc_program *prog, unsigned int bufsize, int npools,
-            void (*shutdown)(struct svc_serv *serv))
+            void (*shutdown)(struct svc_serv *serv, struct net *net))
  {
         struct svc_serv *serv;
         unsigned int vers;
@@ -470,7 +472,7 @@ __svc_create(struct svc_program *prog, unsigned int bufsize, int npools,
         }
  
         if (svc_uses_rpcbind(serv)) {
-               if (svc_rpcb_setup(serv) < 0) {
+               if (svc_rpcb_setup(serv, current->nsproxy->net_ns) < 0) {
                         kfree(serv->sv_pools);
                         kfree(serv);
                         return NULL;
@@ -484,7 +486,7 @@ __svc_create(struct svc_program *prog, unsigned int bufsize, int npools,
  
  struct svc_serv *
  svc_create(struct svc_program *prog, unsigned int bufsize,
-          void (*shutdown)(struct svc_serv *serv))
+          void (*shutdown)(struct svc_serv *serv, struct net *net))
  {
         return __svc_create(prog, bufsize, /*npools*/1, shutdown);
  }
@@ -492,7 +494,7 @@ EXPORT_SYMBOL_GPL(svc_create);
  
  struct svc_serv *
  svc_create_pooled(struct svc_program *prog, unsigned int bufsize,
-                 void (*shutdown)(struct svc_serv *serv),
+                 void (*shutdown)(struct svc_serv *serv, struct net *net),
                   svc_thread_fn func, struct module *mod)
  {
         struct svc_serv *serv;
@@ -509,6 +511,24 @@ svc_create_pooled(struct svc_program *prog, unsigned int bufsize,
  }
  EXPORT_SYMBOL_GPL(svc_create_pooled);
  
+void svc_shutdown_net(struct svc_serv *serv, struct net *net)
+{
+       /*
+        * The set of xprts (contained in the sv_tempsocks and
+        * sv_permsocks lists) is now constant, since it is modified
+        * only by accepting new sockets (done by service threads in
+        * svc_recv) or aging old ones (done by sv_temptimer), or
+        * configuration changes (excluded by whatever locking the
+        * caller is using--nfsd_mutex in the case of nfsd).  So it's
+        * safe to traverse those lists and shut everything down:
+        */
+       svc_close_net(serv, net);
+
+       if (serv->sv_shutdown)
+               serv->sv_shutdown(serv, net);
+}
+EXPORT_SYMBOL_GPL(svc_shutdown_net);
+
  /*
   * Destroy an RPC service. Should be called with appropriate locking to
   * protect the sv_nrthreads, sv_permsocks and sv_tempsocks.
@@ -516,6 +536,8 @@ EXPORT_SYMBOL_GPL(svc_create_pooled);
  void
  svc_destroy(struct svc_serv *serv)
  {
+       struct net *net = current->nsproxy->net_ns;
+
         dprintk("svc: svc_destroy(%s, %d)\n",
                                 serv->sv_program->pg_name,
                                 serv->sv_nrthreads);
@@ -529,19 +551,15 @@ svc_destroy(struct svc_serv *serv)
                 printk("svc_destroy: no threads for serv=%p!\n", serv);
  
         del_timer_sync(&serv->sv_temptimer);
+
+       svc_shutdown_net(serv, net);
+
         /*
-        * The set of xprts (contained in the sv_tempsocks and
-        * sv_permsocks lists) is now constant, since it is modified
-        * only by accepting new sockets (done by service threads in
-        * svc_recv) or aging old ones (done by sv_temptimer), or
-        * configuration changes (excluded by whatever locking the
-        * caller is using--nfsd_mutex in the case of nfsd).  So it's
-        * safe to traverse those lists and shut everything down:
+        * The last user is gone and thus all sockets have to be destroyed to
+        * the point. Check this.
          */
-       svc_close_all(serv);
-
-       if (serv->sv_shutdown)
-               serv->sv_shutdown(serv);
+       BUG_ON(!list_empty(&serv->sv_permsocks));
+       BUG_ON(!list_empty(&serv->sv_tempsocks));
  
         cache_clean_deferred(serv);
  
@@ -795,7 +813,8 @@ EXPORT_SYMBOL_GPL(svc_exit_thread);
   * Returns zero on success; a negative errno value is returned
   * if any error occurs.
   */
-static int __svc_rpcb_register4(const u32 program, const u32 version,
+static int __svc_rpcb_register4(struct net *net, const u32 program,
+                               const u32 version,
                                 const unsigned short protocol,
                                 const unsigned short port)
  {
@@ -818,7 +837,7 @@ static int __svc_rpcb_register4(const u32 program, const u32 version,
                 return -ENOPROTOOPT;
         }
  
-       error = rpcb_v4_register(program, version,
+       error = rpcb_v4_register(net, program, version,
                                         (const struct sockaddr *)&sin, netid);
  
         /*
@@ -826,7 +845,7 @@ static int __svc_rpcb_register4(const u32 program, const u32 version,
          * registration request with the legacy rpcbind v2 protocol.
          */
         if (error == -EPROTONOSUPPORT)
-               error = rpcb_register(program, version, protocol, port);
+               error = rpcb_register(net, program, version, protocol, port);
  
         return error;
  }
@@ -842,7 +861,8 @@ static int __svc_rpcb_register4(const u32 program, const u32 version,
   * Returns zero on success; a negative errno value is returned
   * if any error occurs.
   */
-static int __svc_rpcb_register6(const u32 program, const u32 version,
+static int __svc_rpcb_register6(struct net *net, const u32 program,
+                               const u32 version,
                                 const unsigned short protocol,
                                 const unsigned short port)
  {
@@ -865,7 +885,7 @@ static int __svc_rpcb_register6(const u32 program, const u32 version,
                 return -ENOPROTOOPT;
         }
  
-       error = rpcb_v4_register(program, version,
+       error = rpcb_v4_register(net, program, version,
                                         (const struct sockaddr *)&sin6, netid);
  
         /*
@@ -885,7 +905,7 @@ static int __svc_rpcb_register6(const u32 program, const u32 version,
   * Returns zero on success; a negative errno value is returned
   * if any error occurs.
   */
-static int __svc_register(const char *progname,
+static int __svc_register(struct net *net, const char *progname,
                           const u32 program, const u32 version,
                           const int family,
                           const unsigned short protocol,
@@ -895,12 +915,12 @@ static int __svc_register(const char *progname,
  
         switch (family) {
         case PF_INET:
-               error = __svc_rpcb_register4(program, version,
+               error = __svc_rpcb_register4(net, program, version,
                                                 protocol, port);
                 break;
  #if IS_ENABLED(CONFIG_IPV6)
         case PF_INET6:
-               error = __svc_rpcb_register6(program, version,
+               error = __svc_rpcb_register6(net, program, version,
                                                 protocol, port);
  #endif
         }
@@ -914,14 +934,16 @@ static int __svc_register(const char *progname,
  /**
   * svc_register - register an RPC service with the local portmapper
   * @serv: svc_serv struct for the service to register
+ * @net: net namespace for the service to register
   * @family: protocol family of service's listener socket
   * @proto: transport protocol number to advertise
   * @port: port to advertise
   *
   * Service is registered for any address in the passed-in protocol family
   */
-int svc_register(const struct svc_serv *serv, const int family,
-                const unsigned short proto, const unsigned short port)
+int svc_register(const struct svc_serv *serv, struct net *net,
+                const int family, const unsigned short proto,
+                const unsigned short port)
  {
         struct svc_program      *progp;
         unsigned int            i;
@@ -946,7 +968,7 @@ int svc_register(const struct svc_serv *serv, const int family,
                         if (progp->pg_vers[i]->vs_hidden)
                                 continue;
  
-                       error = __svc_register(progp->pg_name, progp->pg_prog,
+                       error = __svc_register(net, progp->pg_name, progp->pg_prog,
                                                 i, family, proto, port);
                         if (error < 0)
                                 break;
@@ -963,19 +985,19 @@ int svc_register(const struct svc_serv *serv, const int family,
   * any "inet6" entries anyway.  So a PMAP_UNSET should be sufficient
   * in this case to clear all existing entries for [program, version].
   */
-static void __svc_unregister(const u32 program, const u32 version,
+static void __svc_unregister(struct net *net, const u32 program, const u32 version,
                              const char *progname)
  {
         int error;
  
-       error = rpcb_v4_register(program, version, NULL, "");
+       error = rpcb_v4_register(net, program, version, NULL, "");
  
         /*
          * User space didn't support rpcbind v4, so retry this
          * request with the legacy rpcbind v2 protocol.
          */
         if (error == -EPROTONOSUPPORT)
-               error = rpcb_register(program, version, 0, 0);
+               error = rpcb_register(net, program, version, 0, 0);
  
         dprintk("svc: %s(%sv%u), error %d\n",
                         __func__, progname, version, error);
@@ -989,7 +1011,7 @@ static void __svc_unregister(const u32 program, const u32 version,
   * The result of unregistration is reported via dprintk for those who want
   * verification of the result, but is otherwise not important.
   */
-static void svc_unregister(const struct svc_serv *serv)
+static void svc_unregister(const struct svc_serv *serv, struct net *net)
  {
         struct svc_program *progp;
         unsigned long flags;
@@ -1006,7 +1028,7 @@ static void svc_unregister(const struct svc_serv *serv)
  
                         dprintk("svc: attempting to unregister %sv%u\n",
                                 progp->pg_name, i);
-                       __svc_unregister(progp->pg_prog, i, progp->pg_name);
+                       __svc_unregister(net, progp->pg_prog, i, progp->pg_name);
                 }
         }
  
diff --git a/net/sunrpc/svc_xprt.c b/net/sunrpc/svc_xprt.c
index 74cb0d8e9ca1f58aae66b85bee5c313cf3473f18..4bda09d7e1a4cc6e5c5eaf24eec5b171ae922613 100644
--- a/net/sunrpc/svc_xprt.c
+++ b/net/sunrpc/svc_xprt.c
@@ -922,48 +922,65 @@ void svc_close_xprt(struct svc_xprt *xprt)
  }
  EXPORT_SYMBOL_GPL(svc_close_xprt);
  
-static void svc_close_list(struct list_head *xprt_list)
+static void svc_close_list(struct list_head *xprt_list, struct net *net)
  {
         struct svc_xprt *xprt;
  
         list_for_each_entry(xprt, xprt_list, xpt_list) {
+               if (xprt->xpt_net != net)
+                       continue;
                 set_bit(XPT_CLOSE, &xprt->xpt_flags);
                 set_bit(XPT_BUSY, &xprt->xpt_flags);
         }
  }
  
-void svc_close_all(struct svc_serv *serv)
+static void svc_clear_pools(struct svc_serv *serv, struct net *net)
  {
         struct svc_pool *pool;
         struct svc_xprt *xprt;
         struct svc_xprt *tmp;
         int i;
  
-       svc_close_list(&serv->sv_tempsocks);
-       svc_close_list(&serv->sv_permsocks);
-
         for (i = 0; i < serv->sv_nrpools; i++) {
                 pool = &serv->sv_pools[i];
  
                 spin_lock_bh(&pool->sp_lock);
-               while (!list_empty(&pool->sp_sockets)) {
-                       xprt = list_first_entry(&pool->sp_sockets, struct svc_xprt, xpt_ready);
+               list_for_each_entry_safe(xprt, tmp, &pool->sp_sockets, xpt_ready) {
+                       if (xprt->xpt_net != net)
+                               continue;
                         list_del_init(&xprt->xpt_ready);
                 }
                 spin_unlock_bh(&pool->sp_lock);
         }
+}
+
+static void svc_clear_list(struct list_head *xprt_list, struct net *net)
+{
+       struct svc_xprt *xprt;
+       struct svc_xprt *tmp;
+
+       list_for_each_entry_safe(xprt, tmp, xprt_list, xpt_list) {
+               if (xprt->xpt_net != net)
+                       continue;
+               svc_delete_xprt(xprt);
+       }
+       list_for_each_entry(xprt, xprt_list, xpt_list)
+               BUG_ON(xprt->xpt_net == net);
+}
+
+void svc_close_net(struct svc_serv *serv, struct net *net)
+{
+       svc_close_list(&serv->sv_tempsocks, net);
+       svc_close_list(&serv->sv_permsocks, net);
+
+       svc_clear_pools(serv, net);
         /*
          * At this point the sp_sockets lists will stay empty, since
          * svc_enqueue will not add new entries without taking the
          * sp_lock and checking XPT_BUSY.
          */
-       list_for_each_entry_safe(xprt, tmp, &serv->sv_tempsocks, xpt_list)
-               svc_delete_xprt(xprt);
-       list_for_each_entry_safe(xprt, tmp, &serv->sv_permsocks, xpt_list)
-               svc_delete_xprt(xprt);
-
-       BUG_ON(!list_empty(&serv->sv_permsocks));
-       BUG_ON(!list_empty(&serv->sv_tempsocks));
+       svc_clear_list(&serv->sv_tempsocks, net);
+       svc_clear_list(&serv->sv_permsocks, net);
  }
  
  /*
@@ -1089,6 +1106,7 @@ static struct svc_deferred_req *svc_deferred_dequeue(struct svc_xprt *xprt)
   * svc_find_xprt - find an RPC transport instance
   * @serv: pointer to svc_serv to search
   * @xcl_name: C string containing transport's class name
+ * @net: owner net pointer
   * @af: Address family of transport's local address
   * @port: transport's IP port number
   *
@@ -1101,7 +1119,8 @@ static struct svc_deferred_req *svc_deferred_dequeue(struct svc_xprt *xprt)
   * service's list that has a matching class name.
   */
  struct svc_xprt *svc_find_xprt(struct svc_serv *serv, const char *xcl_name,
-                              const sa_family_t af, const unsigned short port)
+                              struct net *net, const sa_family_t af,
+                              const unsigned short port)
  {
         struct svc_xprt *xprt;
         struct svc_xprt *found = NULL;
@@ -1112,6 +1131,8 @@ struct svc_xprt *svc_find_xprt(struct svc_serv *serv, const char *xcl_name,
  
         spin_lock_bh(&serv->sv_lock);
         list_for_each_entry(xprt, &serv->sv_permsocks, xpt_list) {
+               if (xprt->xpt_net != net)
+                       continue;
                 if (strcmp(xprt->xpt_class->xcl_name, xcl_name))
                         continue;
                 if (af != AF_UNSPEC && af != xprt->xpt_local.ss_family)
diff --git a/net/sunrpc/svcauth_unix.c b/net/sunrpc/svcauth_unix.c
index 01153ead1dbaf3d982e4d57bfacccce674b507a4..bcd574f2ac566a96c34b041f44ce1e0b1bedbed8 100644
--- a/net/sunrpc/svcauth_unix.c
+++ b/net/sunrpc/svcauth_unix.c
@@ -211,7 +211,7 @@ static int ip_map_parse(struct cache_detail *cd,
         len = qword_get(&mesg, buf, mlen);
         if (len <= 0) return -EINVAL;
  
-       if (rpc_pton(buf, len, &address.sa, sizeof(address)) == 0)
+       if (rpc_pton(cd->net, buf, len, &address.sa, sizeof(address)) == 0)
                 return -EINVAL;
         switch (address.sa.sa_family) {
         case AF_INET:
@@ -436,7 +436,6 @@ struct unix_gid {
         uid_t                   uid;
         struct group_info       *gi;
  };
-static struct cache_head       *gid_table[GID_HASHMAX];
  
  static void unix_gid_put(struct kref *kref)
  {
@@ -494,8 +493,7 @@ static int unix_gid_upcall(struct cache_detail *cd, struct cache_head *h)
         return sunrpc_cache_pipe_upcall(cd, h, unix_gid_request);
  }
  
-static struct unix_gid *unix_gid_lookup(uid_t uid);
-extern struct cache_detail unix_gid_cache;
+static struct unix_gid *unix_gid_lookup(struct cache_detail *cd, uid_t uid);
  
  static int unix_gid_parse(struct cache_detail *cd,
                         char *mesg, int mlen)
@@ -539,19 +537,19 @@ static int unix_gid_parse(struct cache_detail *cd,
                 GROUP_AT(ug.gi, i) = gid;
         }
  
-       ugp = unix_gid_lookup(uid);
+       ugp = unix_gid_lookup(cd, uid);
         if (ugp) {
                 struct cache_head *ch;
                 ug.h.flags = 0;
                 ug.h.expiry_time = expiry;
-               ch = sunrpc_cache_update(&unix_gid_cache,
+               ch = sunrpc_cache_update(cd,
                                          &ug.h, &ugp->h,
                                          hash_long(uid, GID_HASHBITS));
                 if (!ch)
                         err = -ENOMEM;
                 else {
                         err = 0;
-                       cache_put(ch, &unix_gid_cache);
+                       cache_put(ch, cd);
                 }
         } else
                 err = -ENOMEM;
@@ -587,10 +585,9 @@ static int unix_gid_show(struct seq_file *m,
         return 0;
  }
  
-struct cache_detail unix_gid_cache = {
+static struct cache_detail unix_gid_cache_template = {
         .owner          = THIS_MODULE,
         .hash_size      = GID_HASHMAX,
-       .hash_table     = gid_table,
         .name           = "auth.unix.gid",
         .cache_put      = unix_gid_put,
         .cache_upcall   = unix_gid_upcall,
@@ -602,14 +599,42 @@ struct cache_detail unix_gid_cache = {
         .alloc          = unix_gid_alloc,
  };
  
-static struct unix_gid *unix_gid_lookup(uid_t uid)
+int unix_gid_cache_create(struct net *net)
+{
+       struct sunrpc_net *sn = net_generic(net, sunrpc_net_id);
+       struct cache_detail *cd;
+       int err;
+
+       cd = cache_create_net(&unix_gid_cache_template, net);
+       if (IS_ERR(cd))
+               return PTR_ERR(cd);
+       err = cache_register_net(cd, net);
+       if (err) {
+               cache_destroy_net(cd, net);
+               return err;
+       }
+       sn->unix_gid_cache = cd;
+       return 0;
+}
+
+void unix_gid_cache_destroy(struct net *net)
+{
+       struct sunrpc_net *sn = net_generic(net, sunrpc_net_id);
+       struct cache_detail *cd = sn->unix_gid_cache;
+
+       sn->unix_gid_cache = NULL;
+       cache_purge(cd);
+       cache_unregister_net(cd, net);
+       cache_destroy_net(cd, net);
+}
+
+static struct unix_gid *unix_gid_lookup(struct cache_detail *cd, uid_t uid)
  {
         struct unix_gid ug;
         struct cache_head *ch;
  
         ug.uid = uid;
-       ch = sunrpc_cache_lookup(&unix_gid_cache, &ug.h,
-                                hash_long(uid, GID_HASHBITS));
+       ch = sunrpc_cache_lookup(cd, &ug.h, hash_long(uid, GID_HASHBITS));
         if (ch)
                 return container_of(ch, struct unix_gid, h);
         else
@@ -621,11 +646,13 @@ static struct group_info *unix_gid_find(uid_t uid, struct svc_rqst *rqstp)
         struct unix_gid *ug;
         struct group_info *gi;
         int ret;
+       struct sunrpc_net *sn = net_generic(rqstp->rq_xprt->xpt_net,
+                                           sunrpc_net_id);
  
-       ug = unix_gid_lookup(uid);
+       ug = unix_gid_lookup(sn->unix_gid_cache, uid);
         if (!ug)
                 return ERR_PTR(-EAGAIN);
-       ret = cache_check(&unix_gid_cache, &ug->h, &rqstp->rq_chandle);
+       ret = cache_check(sn->unix_gid_cache, &ug->h, &rqstp->rq_chandle);
         switch (ret) {
         case -ENOENT:
                 return ERR_PTR(-ENOENT);
@@ -633,7 +660,7 @@ static struct group_info *unix_gid_find(uid_t uid, struct svc_rqst *rqstp)
                 return ERR_PTR(-ESHUTDOWN);
         case 0:
                 gi = get_group_info(ug->gi);
-               cache_put(&ug->h, &unix_gid_cache);
+               cache_put(&ug->h, sn->unix_gid_cache);
                 return gi;
         default:
                 return ERR_PTR(-EAGAIN);
@@ -849,56 +876,45 @@ struct auth_ops svcauth_unix = {
         .set_client     = svcauth_unix_set_client,
  };
  
+static struct cache_detail ip_map_cache_template = {
+       .owner          = THIS_MODULE,
+       .hash_size      = IP_HASHMAX,
+       .name           = "auth.unix.ip",
+       .cache_put      = ip_map_put,
+       .cache_upcall   = ip_map_upcall,
+       .cache_parse    = ip_map_parse,
+       .cache_show     = ip_map_show,
+       .match          = ip_map_match,
+       .init           = ip_map_init,
+       .update         = update,
+       .alloc          = ip_map_alloc,
+};
+
  int ip_map_cache_create(struct net *net)
  {
-       int err = -ENOMEM;
-       struct cache_detail *cd;
-       struct cache_head **tbl;
         struct sunrpc_net *sn = net_generic(net, sunrpc_net_id);
+       struct cache_detail *cd;
+       int err;
  
-       cd = kzalloc(sizeof(struct cache_detail), GFP_KERNEL);
-       if (cd == NULL)
-               goto err_cd;
-
-       tbl = kzalloc(IP_HASHMAX * sizeof(struct cache_head *), GFP_KERNEL);
-       if (tbl == NULL)
-               goto err_tbl;
-
-       cd->owner = THIS_MODULE,
-       cd->hash_size = IP_HASHMAX,
-       cd->hash_table = tbl,
-       cd->name = "auth.unix.ip",
-       cd->cache_put = ip_map_put,
-       cd->cache_upcall = ip_map_upcall,
-       cd->cache_parse = ip_map_parse,
-       cd->cache_show = ip_map_show,
-       cd->match = ip_map_match,
-       cd->init = ip_map_init,
-       cd->update = update,
-       cd->alloc = ip_map_alloc,
-
+       cd = cache_create_net(&ip_map_cache_template, net);
+       if (IS_ERR(cd))
+               return PTR_ERR(cd);
         err = cache_register_net(cd, net);
-       if (err)
-               goto err_reg;
-
+       if (err) {
+               cache_destroy_net(cd, net);
+               return err;
+       }
         sn->ip_map_cache = cd;
         return 0;
-
-err_reg:
-       kfree(tbl);
-err_tbl:
-       kfree(cd);
-err_cd:
-       return err;
  }
  
  void ip_map_cache_destroy(struct net *net)
  {
-       struct sunrpc_net *sn;
+       struct sunrpc_net *sn = net_generic(net, sunrpc_net_id);
+       struct cache_detail *cd = sn->ip_map_cache;
  
-       sn = net_generic(net, sunrpc_net_id);
-       cache_purge(sn->ip_map_cache);
-       cache_unregister_net(sn->ip_map_cache, net);
-       kfree(sn->ip_map_cache->hash_table);
-       kfree(sn->ip_map_cache);
+       sn->ip_map_cache = NULL;
+       cache_purge(cd);
+       cache_unregister_net(cd, net);
+       cache_destroy_net(cd, net);
  }
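
Note on the hunks above: both unix_gid_cache_create() and ip_map_cache_create() now build a per-namespace cache from a static template instead of filling in a kzalloc'ed structure field by field. The following is a minimal userspace sketch of that template-clone idea, not the kernel's cache_create_net() itself. The names cache_sketch, ip_map_template_sketch, cache_sketch_create and cache_sketch_destroy are hypothetical, and the hash size of 256 is only an illustrative value.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for struct cache_detail; only the fields needed here. */
struct cache_sketch {
	const char *name;
	size_t hash_size;
	void **hash_table;	/* per-instance storage; left NULL in the template */
};

/* Shared read-only template, playing the role of ip_map_cache_template. */
static const struct cache_sketch ip_map_template_sketch = {
	.name = "auth.unix.ip",
	.hash_size = 256,	/* illustrative value, an assumption */
};

/*
 * Copy the template and give the copy its own zeroed hash table.
 * Returns NULL if either allocation fails.
 */
static struct cache_sketch *cache_sketch_create(const struct cache_sketch *tmpl)
{
	struct cache_sketch *cd;

	assert(tmpl != NULL && tmpl->hash_table == NULL);
	cd = malloc(sizeof(*cd));
	if (cd == NULL)
		return NULL;
	memcpy(cd, tmpl, sizeof(*cd));
	cd->hash_table = calloc(cd->hash_size, sizeof(*cd->hash_table));
	if (cd->hash_table == NULL) {
		free(cd);
		return NULL;
	}
	return cd;
}

/* Release the per-instance table and the copy; the template is untouched. */
static void cache_sketch_destroy(struct cache_sketch *cd)
{
	if (cd == NULL)
		return;
	free(cd->hash_table);
	free(cd);
}
```

Each caller gets an independent table, so two namespaces never share hash buckets, while the function pointers and name stay defined in one place.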
diff --git a/net/sunrpc/svcsock.c b/net/sunrpc/svcsock.c
index 464570906f80c24190260bef957a53128202a486..40ae884db865f975f589a433432652d0fe1936ed 100644
--- a/net/sunrpc/svcsock.c
+++ b/net/sunrpc/svcsock.c
@@ -396,7 +396,7 @@ static int svc_partial_recvfrom(struct svc_rqst *rqstp,
                                 int buflen, unsigned int base)
  {
         size_t save_iovlen;
-       void __user *save_iovbase;
+       void *save_iovbase;
         unsigned int i;
         int ret;
  
@@ -1409,7 +1409,8 @@ static struct svc_sock *svc_setup_socket(struct svc_serv *serv,
  
         /* Register socket with portmapper */
         if (*errp >= 0 && pmap_register)
-               *errp = svc_register(serv, inet->sk_family, inet->sk_protocol,
+               *errp = svc_register(serv, sock_net(sock->sk), inet->sk_family,
+                                    inet->sk_protocol,
                                      ntohs(inet_sk(inet)->inet_sport));
  
         if (*errp < 0) {
diff --git a/net/sunrpc/sysctl.c b/net/sunrpc/sysctl.c
index e65dcc613339a932f4796467c11d8ff6ffd5f488..af7d339add9d5b853174ef8cac0a6016419fc663 100644
--- a/net/sunrpc/sysctl.c
+++ b/net/sunrpc/sysctl.c
@@ -20,6 +20,8 @@
  #include <linux/sunrpc/stats.h>
  #include <linux/sunrpc/svc_xprt.h>
  
+#include "netns.h"
+
  /*
   * Declare the debug flags here
   */
@@ -110,7 +112,7 @@ proc_dodebug(ctl_table *table, int write,
                 *(unsigned int *) table->data = value;
                 /* Display the RPC tasks on writing to rpc_debug */
                 if (strcmp(table->procname, "rpc_debug") == 0)
-                       rpc_show_tasks();
+                       rpc_show_tasks(&init_net);
         } else {
                 if (!access_ok(VERIFY_WRITE, buffer, left))
                         return -EFAULT;
diff --git a/net/sunrpc/xprt.c b/net/sunrpc/xprt.c
index c64c0ef519b594320ff688f3881579d2926be21d..0cbcd1ab49ab5544952d3385b64b2e29b1843872 100644
--- a/net/sunrpc/xprt.c
+++ b/net/sunrpc/xprt.c
@@ -66,6 +66,7 @@ static void    xprt_init(struct rpc_xprt *xprt, struct net *net);
  static void    xprt_request_init(struct rpc_task *, struct rpc_xprt *);
  static void    xprt_connect_status(struct rpc_task *task);
  static int      __xprt_get_cong(struct rpc_xprt *, struct rpc_task *);
+static void     xprt_destroy(struct rpc_xprt *xprt);
  
  static DEFINE_SPINLOCK(xprt_list_lock);
  static LIST_HEAD(xprt_list);
@@ -292,54 +293,57 @@ static inline int xprt_lock_write(struct rpc_xprt *xprt, struct rpc_task *task)
         return retval;
  }
  
-static void __xprt_lock_write_next(struct rpc_xprt *xprt)
+static bool __xprt_lock_write_func(struct rpc_task *task, void *data)
  {
-       struct rpc_task *task;
+       struct rpc_xprt *xprt = data;
         struct rpc_rqst *req;
  
-       if (test_and_set_bit(XPRT_LOCKED, &xprt->state))
-               return;
-
-       task = rpc_wake_up_next(&xprt->sending);
-       if (task == NULL)
-               goto out_unlock;
-
         req = task->tk_rqstp;
         xprt->snd_task = task;
         if (req) {
                 req->rq_bytes_sent = 0;
                 req->rq_ntrans++;
         }
-       return;
+       return true;
+}
  
-out_unlock:
+static void __xprt_lock_write_next(struct rpc_xprt *xprt)
+{
+       if (test_and_set_bit(XPRT_LOCKED, &xprt->state))
+               return;
+
+       if (rpc_wake_up_first(&xprt->sending, __xprt_lock_write_func, xprt))
+               return;
         xprt_clear_locked(xprt);
  }
  
-static void __xprt_lock_write_next_cong(struct rpc_xprt *xprt)
+static bool __xprt_lock_write_cong_func(struct rpc_task *task, void *data)
  {
-       struct rpc_task *task;
+       struct rpc_xprt *xprt = data;
         struct rpc_rqst *req;
  
-       if (test_and_set_bit(XPRT_LOCKED, &xprt->state))
-               return;
-       if (RPCXPRT_CONGESTED(xprt))
-               goto out_unlock;
-       task = rpc_wake_up_next(&xprt->sending);
-       if (task == NULL)
-               goto out_unlock;
-
         req = task->tk_rqstp;
         if (req == NULL) {
                 xprt->snd_task = task;
-               return;
+               return true;
         }
         if (__xprt_get_cong(xprt, task)) {
                 xprt->snd_task = task;
                 req->rq_bytes_sent = 0;
                 req->rq_ntrans++;
-               return;
+               return true;
         }
+       return false;
+}
+
+static void __xprt_lock_write_next_cong(struct rpc_xprt *xprt)
+{
+       if (test_and_set_bit(XPRT_LOCKED, &xprt->state))
+               return;
+       if (RPCXPRT_CONGESTED(xprt))
+               goto out_unlock;
+       if (rpc_wake_up_first(&xprt->sending, __xprt_lock_write_cong_func, xprt))
+               return;
  out_unlock:
         xprt_clear_locked(xprt);
  }
@@ -712,9 +716,7 @@ void xprt_connect(struct rpc_task *task)
         if (xprt_connected(xprt))
                 xprt_release_write(xprt, task);
         else {
-               if (task->tk_rqstp)
-                       task->tk_rqstp->rq_bytes_sent = 0;
-
+               task->tk_rqstp->rq_bytes_sent = 0;
                 task->tk_timeout = task->tk_rqstp->rq_timeout;
                 rpc_sleep_on(&xprt->pending, task, xprt_connect_status);
  
@@ -750,7 +752,7 @@ static void xprt_connect_status(struct rpc_task *task)
         default:
                 dprintk("RPC: %5u xprt_connect_status: error %d connecting to "
                                 "server %s\n", task->tk_pid, -task->tk_status,
-                               task->tk_client->cl_server);
+                               xprt->servername);
                 xprt_release_write(xprt, task);
                 task->tk_status = -EIO;
         }
@@ -884,7 +886,7 @@ void xprt_transmit(struct rpc_task *task)
  {
         struct rpc_rqst *req = task->tk_rqstp;
         struct rpc_xprt *xprt = req->rq_xprt;
-       int status;
+       int status, numreqs;
  
         dprintk("RPC: %5u xprt_transmit(%u)\n", task->tk_pid, req->rq_slen);
  
@@ -921,9 +923,14 @@ void xprt_transmit(struct rpc_task *task)
  
         xprt->ops->set_retrans_timeout(task);
  
+       numreqs = atomic_read(&xprt->num_reqs);
+       if (numreqs > xprt->stat.max_slots)
+               xprt->stat.max_slots = numreqs;
         xprt->stat.sends++;
         xprt->stat.req_u += xprt->stat.sends - xprt->stat.recvs;
         xprt->stat.bklog_u += xprt->backlog.qlen;
+       xprt->stat.sending_u += xprt->sending.qlen;
+       xprt->stat.pending_u += xprt->pending.qlen;
  
         /* Don't race with disconnect */
         if (!xprt_connected(xprt))
@@ -1131,7 +1138,10 @@ void xprt_release(struct rpc_task *task)
                 return;
  
         xprt = req->rq_xprt;
-       rpc_count_iostats(task);
+       if (task->tk_ops->rpc_count_stats != NULL)
+               task->tk_ops->rpc_count_stats(task, task->tk_calldata);
+       else if (task->tk_client)
+               rpc_count_iostats(task, task->tk_client->cl_metrics);
         spin_lock_bh(&xprt->transport_lock);
         xprt->ops->release_xprt(xprt, task);
         if (xprt->ops->release_request)
@@ -1220,6 +1230,17 @@ found:
                             (unsigned long)xprt);
         else
                 init_timer(&xprt->timer);
+
+       if (strlen(args->servername) > RPC_MAXNETNAMELEN) {
+               xprt_destroy(xprt);
+               return ERR_PTR(-EINVAL);
+       }
+       xprt->servername = kstrdup(args->servername, GFP_KERNEL);
+       if (xprt->servername == NULL) {
+               xprt_destroy(xprt);
+               return ERR_PTR(-ENOMEM);
+       }
+
         dprintk("RPC:       created transport %p with %u slots\n", xprt,
                         xprt->max_reqs);
  out:
@@ -1242,6 +1263,7 @@ static void xprt_destroy(struct rpc_xprt *xprt)
         rpc_destroy_wait_queue(&xprt->sending);
         rpc_destroy_wait_queue(&xprt->backlog);
         cancel_work_sync(&xprt->task_cleanup);
+       kfree(xprt->servername);
         /*
          * Tear down transport state and free the rpc_xprt
          */
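
Note on the xprt.c hunks above: __xprt_lock_write_next() and its congestion-aware variant now hand a callback to rpc_wake_up_first(), and the callback decides whether the task at the head of the sending queue may take the transport. The sketch below illustrates that callback-gated dequeue in userspace under simplifying assumptions: a plain singly linked queue, no locking, and a made-up has_credit flag standing in for the congestion check. All names ending in _sketch are hypothetical and are not the SUNRPC API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct task_sketch {
	int id;
	bool has_credit;	/* assumption: stands in for __xprt_get_cong() succeeding */
	struct task_sketch *next;
};

struct queue_sketch {
	struct task_sketch *head;
};

struct xprt_sketch {
	struct task_sketch *snd_task;
};

typedef bool (*wake_fn_sketch)(struct task_sketch *task, void *data);

/*
 * Offer the first queued task to func. Only if func accepts it is the
 * task removed from the queue and returned; otherwise it stays queued
 * and NULL is returned.
 */
static struct task_sketch *wake_up_first_sketch(struct queue_sketch *q,
						wake_fn_sketch func, void *data)
{
	struct task_sketch *task = q->head;

	assert(func != NULL);
	if (task == NULL || !func(task, data))
		return NULL;
	q->head = task->next;
	task->next = NULL;
	return task;
}

/* Callback: claim the transport for the task only when it has credit. */
static bool lock_write_func_sketch(struct task_sketch *task, void *data)
{
	struct xprt_sketch *xprt = data;

	if (!task->has_credit)
		return false;
	xprt->snd_task = task;
	return true;
}
```

Because the decision and the dequeue happen in one call, a task is never taken off the queue and then found to be ineligible, which the older wake-then-check code could not guarantee.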
diff --git a/net/sunrpc/xprtrdma/rpc_rdma.c b/net/sunrpc/xprtrdma/rpc_rdma.c
index 1776e5731dcf1005a7a163867f3132cacd1a7026..558fbab574f00eadf0d52d91ef82e23e858b0dee 100644
--- a/net/sunrpc/xprtrdma/rpc_rdma.c
+++ b/net/sunrpc/xprtrdma/rpc_rdma.c
@@ -771,13 +771,18 @@ repost:
  
         /* get request object */
         req = rpcr_to_rdmar(rqst);
+       if (req->rl_reply) {
+               spin_unlock(&xprt->transport_lock);
+               dprintk("RPC:       %s: duplicate reply 0x%p to RPC "
+                       "request 0x%p: xid 0x%08x\n", __func__, rep, req,
+                       headerp->rm_xid);
+               goto repost;
+       }
  
         dprintk("RPC:       %s: reply 0x%p completes request 0x%p\n"
                 "                   RPC request 0x%p xid 0x%08x\n",
                         __func__, rep, req, rqst, headerp->rm_xid);
  
-       BUG_ON(!req || req->rl_reply);
-
         /* from here on, the reply is no longer an orphan */
         req->rl_reply = rep;
  
diff --git a/net/sunrpc/xprtrdma/verbs.c b/net/sunrpc/xprtrdma/verbs.c
index 28236bab57f929e1edadd8245e5851a2fb925bc2..745973b729af6af33d8a882f45f6a3ba62f62c2e 100644
--- a/net/sunrpc/xprtrdma/verbs.c
+++ b/net/sunrpc/xprtrdma/verbs.c
@@ -1490,6 +1490,9 @@ rpcrdma_register_frmr_external(struct rpcrdma_mr_seg *seg,
         u8 key;
         int len, pageoff;
         int i, rc;
+       int seg_len;
+       u64 pa;
+       int page_no;
  
         pageoff = offset_in_page(seg1->mr_offset);
         seg1->mr_offset -= pageoff;     /* start of page */
@@ -1497,11 +1500,15 @@ rpcrdma_register_frmr_external(struct rpcrdma_mr_seg *seg,
         len = -pageoff;
         if (*nsegs > RPCRDMA_MAX_DATA_SEGS)
                 *nsegs = RPCRDMA_MAX_DATA_SEGS;
-       for (i = 0; i < *nsegs;) {
+       for (page_no = i = 0; i < *nsegs;) {
                 rpcrdma_map_one(ia, seg, writing);
-               seg1->mr_chunk.rl_mw->r.frmr.fr_pgl->page_list[i] = seg->mr_dma;
+               pa = seg->mr_dma;
+               for (seg_len = seg->mr_len; seg_len > 0; seg_len -= PAGE_SIZE) {
+                       seg1->mr_chunk.rl_mw->r.frmr.fr_pgl->
+                               page_list[page_no++] = pa;
+                       pa += PAGE_SIZE;
+               }
                 len += seg->mr_len;
-               BUG_ON(seg->mr_len > PAGE_SIZE);
                 ++seg;
                 ++i;
                 /* Check for holes */
@@ -1540,9 +1547,9 @@ rpcrdma_register_frmr_external(struct rpcrdma_mr_seg *seg,
         frmr_wr.send_flags = IB_SEND_SIGNALED;
         frmr_wr.wr.fast_reg.iova_start = seg1->mr_dma;
         frmr_wr.wr.fast_reg.page_list = seg1->mr_chunk.rl_mw->r.frmr.fr_pgl;
-       frmr_wr.wr.fast_reg.page_list_len = i;
+       frmr_wr.wr.fast_reg.page_list_len = page_no;
         frmr_wr.wr.fast_reg.page_shift = PAGE_SHIFT;
-       frmr_wr.wr.fast_reg.length = i << PAGE_SHIFT;
+       frmr_wr.wr.fast_reg.length = page_no << PAGE_SHIFT;
         BUG_ON(frmr_wr.wr.fast_reg.length < len);
         frmr_wr.wr.fast_reg.access_flags = (writing ?
                                 IB_ACCESS_REMOTE_WRITE | IB_ACCESS_LOCAL_WRITE :
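
Note on the verbs.c hunks above: the page list is now filled with one entry per page of each segment rather than one entry per segment, so a segment longer than PAGE_SIZE no longer trips a BUG_ON. The following userspace sketch reproduces only the inner loop's arithmetic. The function name fill_page_list_sketch, the capacity parameter and the fixed 4096-byte page size are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096	/* assumption: 4 KiB pages */

/*
 * Append one page-list entry per page covered by a segment that starts
 * at DMA address dma and is seg_len bytes long. page_no is the next
 * free slot; the updated slot index is returned.
 */
static size_t fill_page_list_sketch(uint64_t dma, long seg_len,
				    uint64_t *page_list, size_t page_no,
				    size_t capacity)
{
	uint64_t pa = dma;

	for (; seg_len > 0; seg_len -= SKETCH_PAGE_SIZE) {
		assert(page_no < capacity);	/* do not overrun the list */
		page_list[page_no++] = pa;
		pa += SKETCH_PAGE_SIZE;
	}
	return page_no;
}
```

The returned index plays the role of page_no in the patch: it, not the segment count, is what determines the page-list length and the registered length.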
diff --git a/net/sunrpc/xprtsock.c b/net/sunrpc/xprtsock.c
index 55472c48825e6fd43c3357a2f58b398231a2767e..92bc5181dbebde6e82af566ed54f4ca04a61b857 100644
--- a/net/sunrpc/xprtsock.c
+++ b/net/sunrpc/xprtsock.c
@@ -53,12 +53,12 @@ static void xs_close(struct rpc_xprt *xprt);
  /*
   * xprtsock tunables
   */
-unsigned int xprt_udp_slot_table_entries = RPC_DEF_SLOT_TABLE;
-unsigned int xprt_tcp_slot_table_entries = RPC_MIN_SLOT_TABLE;
-unsigned int xprt_max_tcp_slot_table_entries = RPC_MAX_SLOT_TABLE;
+static unsigned int xprt_udp_slot_table_entries = RPC_DEF_SLOT_TABLE;
+static unsigned int xprt_tcp_slot_table_entries = RPC_MIN_SLOT_TABLE;
+static unsigned int xprt_max_tcp_slot_table_entries = RPC_MAX_SLOT_TABLE;
  
-unsigned int xprt_min_resvport = RPC_DEF_MIN_RESVPORT;
-unsigned int xprt_max_resvport = RPC_DEF_MAX_RESVPORT;
+static unsigned int xprt_min_resvport = RPC_DEF_MIN_RESVPORT;
+static unsigned int xprt_max_resvport = RPC_DEF_MAX_RESVPORT;
  
  #define XS_TCP_LINGER_TO       (15U * HZ)
  static unsigned int xs_tcp_fin_timeout __read_mostly = XS_TCP_LINGER_TO;
@@ -2227,7 +2227,7 @@ static void xs_local_print_stats(struct rpc_xprt *xprt, struct seq_file *seq)
                 idle_time = (long)(jiffies - xprt->last_used) / HZ;
  
         seq_printf(seq, "\txprt:\tlocal %lu %lu %lu %ld %lu %lu %lu "
-                       "%llu %llu\n",
+                       "%llu %llu %lu %llu %llu\n",
                         xprt->stat.bind_count,
                         xprt->stat.connect_count,
                         xprt->stat.connect_time,
@@ -2236,7 +2236,10 @@ static void xs_local_print_stats(struct rpc_xprt *xprt, struct seq_file *seq)
                         xprt->stat.recvs,
                         xprt->stat.bad_xids,
                         xprt->stat.req_u,
-                       xprt->stat.bklog_u);
+                       xprt->stat.bklog_u,
+                       xprt->stat.max_slots,
+                       xprt->stat.sending_u,
+                       xprt->stat.pending_u);
  }
  
  /**
@@ -2249,14 +2252,18 @@ static void xs_udp_print_stats(struct rpc_xprt *xprt, struct seq_file *seq)
  {
         struct sock_xprt *transport = container_of(xprt, struct sock_xprt, xprt);
  
-       seq_printf(seq, "\txprt:\tudp %u %lu %lu %lu %lu %Lu %Lu\n",
+       seq_printf(seq, "\txprt:\tudp %u %lu %lu %lu %lu %llu %llu "
+                       "%lu %llu %llu\n",
                         transport->srcport,
                         xprt->stat.bind_count,
                         xprt->stat.sends,
                         xprt->stat.recvs,
                         xprt->stat.bad_xids,
                         xprt->stat.req_u,
-                       xprt->stat.bklog_u);
+                       xprt->stat.bklog_u,
+                       xprt->stat.max_slots,
+                       xprt->stat.sending_u,
+                       xprt->stat.pending_u);
  }
  
  /**
@@ -2273,7 +2280,8 @@ static void xs_tcp_print_stats(struct rpc_xprt *xprt, struct seq_file *seq)
         if (xprt_connected(xprt))
                 idle_time = (long)(jiffies - xprt->last_used) / HZ;
  
-       seq_printf(seq, "\txprt:\ttcp %u %lu %lu %lu %ld %lu %lu %lu %Lu %Lu\n",
+       seq_printf(seq, "\txprt:\ttcp %u %lu %lu %lu %ld %lu %lu %lu "
+                       "%llu %llu %lu %llu %llu\n",
                         transport->srcport,
                         xprt->stat.bind_count,
                         xprt->stat.connect_count,
@@ -2283,7 +2291,10 @@ static void xs_tcp_print_stats(struct rpc_xprt *xprt, struct seq_file *seq)
                         xprt->stat.recvs,
                         xprt->stat.bad_xids,
                         xprt->stat.req_u,
-                       xprt->stat.bklog_u);
+                       xprt->stat.bklog_u,
+                       xprt->stat.max_slots,
+                       xprt->stat.sending_u,
+                       xprt->stat.pending_u);
  }
  
  /*
diff --git a/security/keys/key.c b/security/keys/key.c
index 7ada8019be1f2c08314fe0c8c14d3058a1ee26ea..06783cffb3afa631fcca8efee90d17cc4f711b2c 100644
--- a/security/keys/key.c
+++ b/security/keys/key.c
@@ -671,6 +671,26 @@ found_kernel_type:
         return ktype;
  }
  
+void key_set_timeout(struct key *key, unsigned timeout)
+{
+       struct timespec now;
+       time_t expiry = 0;
+
+       /* make the changes with the locks held to prevent races */
+       down_write(&key->sem);
+
+       if (timeout > 0) {
+               now = current_kernel_time();
+               expiry = now.tv_sec + timeout;
+       }
+
+       key->expiry = expiry;
+       key_schedule_gc(key->expiry + key_gc_delay);
+
+       up_write(&key->sem);
+}
+EXPORT_SYMBOL_GPL(key_set_timeout);
+
  /*
   * Unlock a key type locked by key_type_lookup().
   */
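
Note on key_set_timeout() above: a timeout of zero clears the expiry, and any other value sets the expiry to the current time plus the timeout in seconds. The sketch below isolates that calculation in userspace. The helper name expiry_from_timeout_sketch is hypothetical, and the key semaphore and garbage-collector scheduling are deliberately left out.

```c
#include <assert.h>
#include <time.h>

/*
 * Compute the absolute expiry for a key: 0 means "never expires",
 * otherwise the expiry is now_sec plus the timeout in seconds.
 */
static time_t expiry_from_timeout_sketch(time_t now_sec, unsigned timeout)
{
	if (timeout == 0)
		return 0;
	return now_sec + (time_t)timeout;
}
```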
diff --git a/security/keys/keyctl.c b/security/keys/keyctl.c
index 6523599e9ac0e08d7911c1fb007cc1971d1334a6..fb767c6cd99f6a92da8fa989999a18988bce1a19 100644
--- a/security/keys/keyctl.c
+++ b/security/keys/keyctl.c
@@ -14,6 +14,7 @@
  #include <linux/sched.h>
  #include <linux/slab.h>
  #include <linux/syscalls.h>
+#include <linux/key.h>
  #include <linux/keyctl.h>
  #include <linux/fs.h>
  #include <linux/capability.h>
@@ -1257,10 +1258,8 @@ error:
   */
  long keyctl_set_timeout(key_serial_t id, unsigned timeout)
  {
-       struct timespec now;
         struct key *key, *instkey;
         key_ref_t key_ref;
-       time_t expiry;
         long ret;
  
         key_ref = lookup_user_key(id, KEY_LOOKUP_CREATE | KEY_LOOKUP_PARTIAL,
@@ -1286,20 +1285,7 @@ long keyctl_set_timeout(key_serial_t id, unsigned timeout)
  
  okay:
         key = key_ref_to_ptr(key_ref);
-
-       /* make the changes with the locks held to prevent races */
-       down_write(&key->sem);
-
-       expiry = 0;
-       if (timeout > 0) {
-               now = current_kernel_time();
-               expiry = now.tv_sec + timeout;
-       }
-
-       key->expiry = expiry;
-       key_schedule_gc(key->expiry + key_gc_delay);
-
-       up_write(&key->sem);
+       key_set_timeout(key, timeout);
         key_put(key);
  
         ret = 0;
Documentation/filesystems/nfs/idmapper.txt
Documentation/filesystems/nfs/pnfs.txt
Documentation/kernel-parameters.txt
fs/lockd/clnt4xdr.c
fs/lockd/clntlock.c
fs/lockd/clntxdr.c
fs/lockd/host.c
fs/lockd/mon.c
fs/lockd/netns.h [new file with mode: 0644]
fs/lockd/svc.c
fs/lockd/svclock.c
fs/nfs/Kconfig
fs/nfs/blocklayout/blocklayout.c
fs/nfs/blocklayout/blocklayout.h
fs/nfs/blocklayout/blocklayoutdev.c
fs/nfs/blocklayout/blocklayoutdm.c
fs/nfs/blocklayout/extents.c
fs/nfs/cache_lib.c
fs/nfs/cache_lib.h
fs/nfs/callback.c
fs/nfs/callback.h
fs/nfs/callback_proc.c
fs/nfs/callback_xdr.c
fs/nfs/client.c
fs/nfs/delegation.c
fs/nfs/delegation.h
fs/nfs/dir.c
fs/nfs/direct.c
fs/nfs/dns_resolve.c
fs/nfs/dns_resolve.h
fs/nfs/file.c
fs/nfs/fscache.c
fs/nfs/idmap.c
fs/nfs/inode.c
fs/nfs/internal.h
fs/nfs/mount_clnt.c
fs/nfs/namespace.c
fs/nfs/netns.h [new file with mode: 0644]
fs/nfs/nfs2xdr.c
fs/nfs/nfs3acl.c
fs/nfs/nfs3proc.c
fs/nfs/nfs3xdr.c
fs/nfs/nfs4_fs.h
fs/nfs/nfs4filelayout.c
fs/nfs/nfs4filelayout.h
fs/nfs/nfs4filelayoutdev.c
fs/nfs/nfs4namespace.c
fs/nfs/nfs4proc.c
fs/nfs/nfs4state.c
fs/nfs/nfs4xdr.c
fs/nfs/nfsroot.c
fs/nfs/objlayout/objio_osd.c
fs/nfs/objlayout/objlayout.c
fs/nfs/objlayout/objlayout.h
fs/nfs/pagelist.c
fs/nfs/pnfs.c
fs/nfs/pnfs.h
fs/nfs/pnfs_dev.c
fs/nfs/proc.c
fs/nfs/read.c
fs/nfs/super.c
fs/nfs/sysctl.c
fs/nfs/unlink.c
fs/nfs/write.c
fs/nfsd/nfs4callback.c
fs/nfsd/nfs4state.c
fs/nfsd/nfsctl.c
fs/nfsd/nfssvc.c
fs/nfsd/stats.c
include/linux/key.h
include/linux/lockd/bind.h
include/linux/lockd/lockd.h
include/linux/lockd/xdr4.h
include/linux/nfs.h
include/linux/nfs4.h
include/linux/nfs_fs.h
include/linux/nfs_fs_i.h
include/linux/nfs_fs_sb.h
include/linux/nfs_idmap.h
include/linux/nfs_iostat.h
include/linux/nfs_page.h
include/linux/nfs_xdr.h
include/linux/sunrpc/auth.h
include/linux/sunrpc/bc_xprt.h
include/linux/sunrpc/cache.h
include/linux/sunrpc/clnt.h
include/linux/sunrpc/debug.h
include/linux/sunrpc/metrics.h
include/linux/sunrpc/rpc_pipe_fs.h
include/linux/sunrpc/sched.h
include/linux/sunrpc/stats.h
include/linux/sunrpc/svc.h
include/linux/sunrpc/svc_xprt.h
include/linux/sunrpc/svcauth.h
include/linux/sunrpc/svcauth_gss.h
include/linux/sunrpc/svcsock.h
include/linux/sunrpc/xprt.h
include/linux/sunrpc/xprtsock.h
include/trace/events/sunrpc.h [new file with mode: 0644]
net/sunrpc/Kconfig
net/sunrpc/addr.c
net/sunrpc/auth_gss/auth_gss.c
net/sunrpc/auth_gss/gss_krb5_crypto.c
net/sunrpc/auth_gss/gss_krb5_mech.c
net/sunrpc/auth_gss/gss_krb5_seal.c
net/sunrpc/auth_gss/svcauth_gss.c
net/sunrpc/backchannel_rqst.c
net/sunrpc/cache.c
net/sunrpc/clnt.c
net/sunrpc/netns.h
net/sunrpc/rpc_pipe.c
net/sunrpc/rpcb_clnt.c
net/sunrpc/sched.c
net/sunrpc/stats.c
net/sunrpc/sunrpc.h
net/sunrpc/sunrpc_syms.c
net/sunrpc/svc.c
net/sunrpc/svc_xprt.c
net/sunrpc/svcauth_unix.c
net/sunrpc/svcsock.c
net/sunrpc/sysctl.c
net/sunrpc/xprt.c
net/sunrpc/xprtrdma/rpc_rdma.c
net/sunrpc/xprtrdma/verbs.c
net/sunrpc/xprtsock.c
security/keys/key.c
security/keys/keyctl.c