--- trunk/uoms-doc/doc.tex 2010/04/14 07:39:18 5
+++ trunk/uoms-doc/doc.tex 2010/04/23 09:49:46 7
@@ -21,9 +21,9 @@
 taboada@udc.es\\
+\section{Acknowledgments}
-
-
+This work was funded by Hewlett-Packard Spain and partially supported by the Ministry of Science and Innovation of Spain under Project TIN2007-67537-C03-02 and by the Galician Government (Xunta de Galicia, Spain) under the Consolidation Program of Competitive Research Groups (Ref. 3/2006 DOGA 12/13/2006). We gratefully thank Brian Wibecan for his comments and for sharing his thoughts and knowledge with us. We also thank Jim Bovay for his support and CESGA for providing access to the FinisTerrae supercomputer.
 \section{Files in this benchmarking suite}
@@ -91,6 +91,12 @@
 \item \texttt{upc\_memput} (local)
 \item \texttt{memcpy} (local)
 \item \texttt{memmove} (local)
+\item \texttt{upc\_memcpy\_async} (remote)
+\item \texttt{upc\_memget\_async} (remote)
+\item \texttt{upc\_memput\_async} (remote)
+\item \texttt{upc\_memcpy\_async} (local)
+\item \texttt{upc\_memget\_async} (local)
+\item \texttt{upc\_memput\_async} (local)
 \item \texttt{upc\_memcpy\_asynci} (remote)
 \item \texttt{upc\_memget\_asynci} (remote)
 \item \texttt{upc\_memput\_asynci} (remote)
@@ -113,6 +119,7 @@
 \begin{itemize}
 \item \texttt{NUMCORES}: If defined it will override the detection of the number of cores. If not defined the number of cores is set through the \texttt{sysconf(\_SC\_NPROCESSORS\_ONLN)} system call.
 \item \texttt{ASYNC\_MEM\_TEST}: If defined asynchronous memory transfer tests will be built. Default is defined.
+\item \texttt{ASYNCI\_MEM\_TEST}: If defined asynchronous memory transfer with implicit handlers tests will be built. Default is defined.
 \item \texttt{MINSIZE}: The minimum message size to be used in the benchmarking. Default is 4 bytes.
 \item \texttt{MAXSIZE}: The maximum message size to be used in the benchmarking. Default is 16 megabytes.
 \end{itemize}
@@ -123,7 +130,7 @@
 \begin{itemize}
 \item \texttt{-help}: Print usage information and exits.
 \item \texttt{-version}: Print UOMS version and exits.
-\item \texttt{-off\_cache}: Enable cache invalidation. Be aware that the cache invalidation greatly increases the memory consumption. Also, note that for block sizes smaller than the cache line size it will not work.
+\item \texttt{-off\_cache}: Enable cache invalidation. Be aware that the cache invalidation greatly increases the memory consumption. Also, note that for block sizes smaller than the cache line size it will not have any effect.
 \item \texttt{-warmup}: Enable a warmup iteration.
 \item \texttt{-reduce\_op OP}: Choose the reduce operation to be performed by \texttt{upc\_all\_reduceD} and \texttt{upc\_all\_prefix\_reduceD}. Valid operations are:
 \begin{itemize}
@@ -198,6 +205,12 @@
 \item \texttt{upc\_all\_prefix\_reduceD}
 \item \texttt{upc\_all\_reduceLD}
 \item \texttt{upc\_all\_prefix\_reduceLD}
+\item \texttt{upc\_memget\_async}
+\item \texttt{upc\_memput\_async}
+\item \texttt{upc\_memcpy\_async}
+\item \texttt{local\_upc\_memget\_async}
+\item \texttt{local\_upc\_memput\_async}
+\item \texttt{local\_upc\_memcpy\_async}
 \item \texttt{upc\_memget\_asynci}
 \item \texttt{upc\_memput\_asynci}
 \item \texttt{upc\_memcpy\_asynci}