- nvvp ./a.out

Analyzes GPU kernel execution time using the GUI tool (NVIDIA Visual Profiler).


- nvcc "<file>.cu"

Compiles a CUDA source file. Without -o, the output on Linux is a.out.


- nvprof ./a.out

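Profiles the compiled a.out with the nvprof analysis tool to examine execution time in detail.

At this stage it is important to try different approaches and see whether the run time can be reduced.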




nvprof --print-gpu-trace ./nbody --benchmark -numdevices=2 -i=1

nvprof --analysis-metrics -o nbody-analysis.nvprof ./nbody --benchmark -numdevices=2 -i=1

https://developer.nvidia.com/blog/cuda-pro-tip-nvprof-your-handy-universal-gpu-profiler/



- cuDriverGetVersion

Returns in *driverVersion the version number of the installed CUDA driver. This function automatically returns CUDA_ERROR_INVALID_VALUE if the driverVersion argument is NULL.

Parameters:
driverVersion - Returns the CUDA driver version
Returns:
CUDA_SUCCESS, CUDA_ERROR_INVALID_VALUE
Note:
Note that this function may also return error codes from previous, asynchronous launches.


Returns the latest version of CUDA supported by the driver.
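The integer written to *driverVersion is encoded as 1000 × major + 10 × minor, so CUDA 9.2 is reported as 9020. Below is a minimal sketch of decoding that value in plain C. The helper names decode_cuda_version and format_cuda_version are made up for illustration and are not part of the driver API.

```c
#include <assert.h>
#include <stdio.h>

/* Split a CUDA version integer (1000 * major + 10 * minor) into its parts.
 * Hypothetical helper for illustration; not part of the driver API. */
void decode_cuda_version(int version, int *major, int *minor)
{
    assert(major != NULL && minor != NULL);
    *major = version / 1000;
    *minor = (version % 1000) / 10;
}

/* Write the version as "major.minor" into buf.
 * Returns the value returned by snprintf. */
int format_cuda_version(int version, char *buf, size_t len)
{
    int major, minor;
    decode_cuda_version(version, &major, &minor);
    return snprintf(buf, len, "%d.%d", major, minor);
}
```

In a real program the input would be the value filled in by cuDriverGetVersion(&driverVersion).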



- cuInit

Initializes the driver API and must be called before any other function from the driver API. Currently, the Flags parameter must be 0. If cuInit() has not been called, any function from the driver API will return CUDA_ERROR_NOT_INITIALIZED.

Parameters:
Flags - Initialization flag for CUDA.
Returns:
CUDA_SUCCESS, CUDA_ERROR_INVALID_VALUE, CUDA_ERROR_INVALID_DEVICE
Note:
Note that this function may also return error codes from previous, asynchronous launches.


- cuDeviceGet

Returns a handle to a compute device.

Parameters
device
- Returned device handle
ordinal
- Device number to get handle for



- cuDeviceGetName

CUresult cuDeviceGetName ( char* name, int len, CUdevice dev )
Returns an identifier string for the device.
Parameters
name
- Returned identifier string for the device
len
- Maximum length of string to store in name
dev
- Device to get identifier string for
Description

Returns an ASCII string identifying the device dev in the NULL-terminated string pointed to by name. len specifies the maximum length of the string that may be returned.

Note:

Note that this function may also return error codes from previous, asynchronous launches.

See also:

cuDeviceGetAttribute, cuDeviceGetUuid, cuDeviceGetLuid, cuDeviceGetCount, cuDeviceGet, cuDeviceTotalMem, cudaGetDeviceProperties


- cuDeviceTotalMem_v2

CUresult cuDeviceTotalMem ( size_t* bytes, CUdevice dev )
Returns the total amount of memory on the device.
Parameters
bytes
- Returned memory available on device in bytes
dev
- Device handle
Description

Returns in *bytes the total amount of memory available on the device dev in bytes.

Note:

Note that this function may also return error codes from previous, asynchronous launches.

See also:

cuDeviceGetAttribute, cuDeviceGetCount, cuDeviceGetName, cuDeviceGetUuid, cuDeviceGet, cudaMemGetInfo
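The value comes back as a raw byte count in a size_t, which is awkward to read for multi-gigabyte devices. Below is a small sketch that converts it to GiB for display. bytes_to_gib is a hypothetical helper name, not a CUDA function. (cuda.h maps the name cuDeviceTotalMem to the cuDeviceTotalMem_v2 symbol, which is why both names appear above.)

```c
#include <assert.h>
#include <stddef.h>

/* Convert a byte count to GiB (1 GiB = 1024 * 1024 * 1024 bytes).
 * Hypothetical display helper; not part of the driver API. */
double bytes_to_gib(size_t bytes)
{
    return (double)bytes / (1024.0 * 1024.0 * 1024.0);
}
```

For example, the result could be printed with printf("%.2f GiB\n", bytes_to_gib(bytes)) after cuDeviceTotalMem has filled in bytes.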



- cuDeviceGetAttribute

CUresult cuDeviceGetAttribute ( int* pi, CUdevice_attribute attrib, CUdevice dev )
Returns information about the device.
Parameters
pi
- Returned device attribute value
attrib
- Device attribute to query
dev
- Device handle
Description

Returns in *pi the integer value of the attribute attrib on device dev. The supported attributes are the values of the CUdevice_attribute enumeration; see the driver API reference for the full list.

Note:

Note that this function may also return error codes from previous, asynchronous launches.

See also:

cuDeviceGetCount, cuDeviceGetName, cuDeviceGetUuid, cuDeviceGet, cuDeviceTotalMem, cudaDeviceGetAttribute, cudaGetDeviceProperties




- cuDeviceGetUuid

CUresult cuDeviceGetUuid ( CUuuid* uuid, CUdevice dev )
Returns a UUID for the device.
Parameters
uuid
- Returned UUID
dev
- Device to get identifier string for
Description

Returns 16 octets identifying the device dev in the structure pointed to by uuid.

Note:

Note that this function may also return error codes from previous, asynchronous launches.

See also:

cuDeviceGetAttribute, cuDeviceGetCount, cuDeviceGetName, cuDeviceGetLuid, cuDeviceGet, cuDeviceTotalMem, cudaGetDeviceProperties
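The 16 octets are raw bytes, not printable text, so they need formatting before they can be logged. Below is a sketch that renders them in the conventional 8-4-4-4-12 hexadecimal grouping. DeviceUuid is a stand-in defined here so the example compiles without cuda.h (the real CUuuid likewise holds a 16-byte array), and format_uuid is a hypothetical helper name.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for the driver's CUuuid, which holds 16 raw bytes.
 * Defined here only so the sketch compiles without cuda.h. */
typedef struct {
    char bytes[16];
} DeviceUuid;

/* Write the 16 octets as lowercase hex in 8-4-4-4-12 grouping.
 * out needs room for 37 chars (36 characters plus the terminator).
 * Returns 0 on success, -1 if the buffer is too small. */
int format_uuid(const DeviceUuid *id, char *out, size_t len)
{
    static const char hex[] = "0123456789abcdef";
    size_t pos = 0;
    int i;

    assert(id != NULL && out != NULL);
    if (len < 37) {
        return -1;
    }
    for (i = 0; i < 16; i++) {
        unsigned char b = (unsigned char)id->bytes[i];
        /* Dashes go before bytes 4, 6, 8 and 10. */
        if (i == 4 || i == 6 || i == 8 || i == 10) {
            out[pos++] = '-';
        }
        out[pos++] = hex[b >> 4];
        out[pos++] = hex[b & 0x0f];
    }
    out[pos] = '\0';
    return 0;
}
```

With the real API, the bytes member of the CUuuid filled in by cuDeviceGetUuid would be copied or passed in place of the stand-in struct.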




- cuDeviceGetLuid

CUresult cuDeviceGetLuid ( char* luid, unsigned int* deviceNodeMask, CUdevice dev )
Returns an LUID and device node mask for the device.
Parameters
luid
- Returned LUID
deviceNodeMask
- Returned device node mask
dev
- Device to get identifier string for
Description

Returns identifying information (luid and deviceNodeMask) so that the device can be matched with graphics APIs.

Note:

Note that this function may also return error codes from previous, asynchronous launches.

See also:

cuDeviceGetAttribute, cuDeviceGetCount, cuDeviceGetName, cuDeviceGet, cuDeviceTotalMem, cudaGetDeviceProperties



- cuModuleUnload

CUresult cuModuleUnload ( CUmodule hmod )
Unloads a module.
Parameters
hmod
- Module to unload
Description

Unloads a module hmod from the current context.

Note:

Note that this function may also return error codes from previous, asynchronous launches.

See also:

cuModuleGetFunction, cuModuleGetGlobal, cuModuleGetTexRef, cuModuleLoad, cuModuleLoadData, cuModuleLoadDataEx, cuModuleLoadFatBinary


- cuDevicePrimaryCtxRelease

CUresult cuDevicePrimaryCtxRelease ( CUdevice dev )
Release the primary context on the GPU.
Parameters
dev
- Device whose primary context is released
Description

Releases the primary context interop on the device by decreasing the usage count by 1. If the usage drops to 0 the primary context of device dev will be destroyed regardless of how many threads it is current to.

Please note that unlike cuCtxDestroy() this method does not pop the context from stack in any circumstances.

Note:

Note that this function may also return error codes from previous, asynchronous launches.

See also:

cuDevicePrimaryCtxRetain, cuCtxDestroy, cuCtxGetApiVersion, cuCtxGetCacheConfig, cuCtxGetDevice, cuCtxGetFlags, cuCtxGetLimit, cuCtxPopCurrent, cuCtxPushCurrent, cuCtxSetCacheConfig, cuCtxSetLimit, cuCtxSynchronize
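The usage-count behavior described above (each release decrements the count, and the context goes away when it reaches 0) can be pictured with a toy reference counter. This is only a model of the description, not the driver's implementation. ToyPrimaryCtx, toy_retain and toy_release are hypothetical names that stand in for the roles of cuDevicePrimaryCtxRetain and cuDevicePrimaryCtxRelease.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model of a reference-counted context. NOT the driver's
 * implementation; every name here is hypothetical. */
typedef struct {
    int usage_count; /* number of outstanding retains */
    int alive;       /* 1 while the modeled context exists */
} ToyPrimaryCtx;

/* Models a retain: the first retain brings the context to life. */
void toy_retain(ToyPrimaryCtx *ctx)
{
    assert(ctx != NULL);
    if (ctx->usage_count == 0) {
        ctx->alive = 1;
    }
    ctx->usage_count++;
}

/* Models a release: decrements the count by 1 and marks the
 * context destroyed once the count drops to 0. */
void toy_release(ToyPrimaryCtx *ctx)
{
    assert(ctx != NULL && ctx->usage_count > 0);
    ctx->usage_count--;
    if (ctx->usage_count == 0) {
        ctx->alive = 0;
    }
}
```

The practical takeaway under this model is that every retain should be paired with exactly one release, and that a release by one user does not tear the context down while another user still holds it.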



------------------------------------------------

Using Linux!

- cd "<folder name>"

Moves the current working location into that folder.


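- Tab

Completes the word currently being typed by searching for matching file names and filling the result straight into the command line.

It is hard to explain, but it makes sense as soon as you try it.

It is very, very, very similar to Tab completion in Visual Studio!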


- cd ..

Moves to the parent directory.


- dir

Lists the files contained in the current directory (ls is the more common equivalent on Linux).




- References

- cuDriverGetVersion

https://docs.nvidia.com/cuda/cuda-runtime-api/group__CUDART____VERSION.html

http://developer.download.nvidia.com/compute/cuda/2_3/toolkit/docs/online/group__CUVERSION_gf83e088e9433ce2e9ce87203791dd122.html


- cuInit

http://developer.download.nvidia.com/compute/cuda/2_3/toolkit/docs/online/group__CUINIT_g4703189f4c7f490c73f77942a3fa8443.html


- cuDeviceGet

https://docs.nvidia.com/cuda/cuda-driver-api/group__CUDA__DEVICE.html


- cuDeviceGetName


- cuDeviceTotalMem_v2


- cuDeviceGetAttribute


- cuDeviceGetUuid


- cuDeviceGetLuid


- cuModuleUnload


- cuDevicePrimaryCtxRelease



