Multi-Core Computer Architecture | Week 5

Course Links: https://onlinecourses.nptel.ac.in/noc23_cs113/course

Q1. In a typical GPU kernel execution, which of the following statements is/are FALSE
Threads of the same block can share data.
Data transfer from device to host memory happens after GPU kernel execute.
GPU threads can access contents of host memory directly.
Data transfer from host to device memory happens before GPU kernel execute.

Q2. Which one of the following statements is TRUE?
Switching between instruction streams is more frequent in coarse-grained multithreading than in fine-grained multithreading.
Hyper-threading issues instructions from more than one instruction stream per slot.
Multithreading has better resource utilization than hyper-threading.
Multithreading can give better throughput than hyper-threading.

Q3. Which one of the following is FALSE with respect to a superscalar processor?
CPI will be ideally less than 1.
It can support multiple instruction issue per clock cycle.
There will be multiple functional units, but only one of them can be busy at any given point in time.
There is operational support for fetching more than one instruction per clock cycle.

Q4. Which one of the following processors executes instruction bundles created by a compiler that exploits parallelism in the code?
Scalar processors
VLIW processors
Speculative processors
SIMD processors

Q5. Which execution model is used in a GPU, where each thread executes the same code but on different data elements?
SIMD
SISD
MIMD
MISD

Q6. In a GPU, which one of the following statements is TRUE with respect to memory coalescing?
Maximum throughput happens when threads in adjacent warps access the same cache line at a time.
Maximum throughput happens when threads in a warp access the same cache line at a time.
Maximum throughput happens when all threads in a warp access adjacent rows in memory at a time.
Maximum throughput happens when threads in a warp access adjacent cache lines at a time.
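To make the idea of coalescing concrete, the rough sketch below counts how many distinct cache lines one warp touches under two access patterns. The warp size (32), element size (4 bytes) and cache line size (128 bytes) are assumed values chosen for illustration; they are not given in the question, and `cache_lines_touched` is a hypothetical helper.

```python
def cache_lines_touched(addresses, line_bytes=128):
    """Count the distinct cache lines covered by a list of byte addresses."""
    return len({addr // line_bytes for addr in addresses})


WARP_SIZE = 32    # assumed number of threads per warp
ELEM_BYTES = 4    # assumed element size (e.g. a 32-bit float)
LINE_BYTES = 128  # assumed cache line size

# Thread t reads element t: consecutive 4-byte words.
contiguous = [t * ELEM_BYTES for t in range(WARP_SIZE)]

# Thread t reads an element one full cache line away from thread t-1.
strided = [t * LINE_BYTES for t in range(WARP_SIZE)]

print(cache_lines_touched(contiguous, LINE_BYTES))
print(cache_lines_touched(strided, LINE_BYTES))
```

Under these assumptions, the fewer cache lines a warp's request spans, the fewer memory transactions are needed to serve it.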

Q7. Consider a 1600×1000 HD display with a refresh rate of 50 frames/second. It takes 50 instructions to process a pixel. A processor running at 1 GHz with an average IPC of 1 is used to process the display. What is the minimum number of such processors required to ensure quality display streaming?

Answer: 4
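The answer to Q7 can be checked by comparing the instruction rate the display demands with the rate one processor delivers. The sketch below uses only the numbers from the question; the function name `processors_needed` is a hypothetical helper.

```python
def processors_needed(width, height, fps, instr_per_pixel, clock_hz, ipc):
    """Minimum number of processors that sustains the required instruction rate."""
    # Instructions that must be executed every second to process all frames.
    required_per_sec = width * height * fps * instr_per_pixel
    # Instructions a single processor completes every second.
    per_processor = clock_hz * ipc
    # Integer ceiling division: a fraction of a processor still costs a whole one.
    return (required_per_sec + per_processor - 1) // per_processor


count = processors_needed(
    width=1600, height=1000, fps=50,
    instr_per_pixel=50, clock_hz=1_000_000_000, ipc=1,
)
print(count)
```

Here 1600 × 1000 × 50 × 50 = 4 × 10^9 instructions per second are required, and each processor supplies 10^9 per second.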

Q8. Given an image A represented as a 12×12 pixel matrix. An operation is done on A by a GPU that uses 2D blocks with 4 threads per block. Consider a pixel P whose blockIdx.x=3, blockIdx.y=2, threadIdx.x=0, and threadIdx.y=1. If the image A is stored in row-major format in memory from location A[0] to A[143], what is the index of P in the array?

Answer: 66
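The index in Q8 can be reproduced with the usual global-coordinate calculation. The sketch below assumes the 4 threads of each block are arranged as 2×2 (the question only says "2D blocks", so this layout is an assumption), and `pixel_index` is a hypothetical helper rather than a CUDA API.

```python
def pixel_index(block_idx, thread_idx, block_dim, image_width):
    """Row-major array index of the pixel handled by one thread."""
    bx, by = block_idx
    tx, ty = thread_idx
    bdx, bdy = block_dim
    col = bx * bdx + tx  # global x coordinate (column)
    row = by * bdy + ty  # global y coordinate (row)
    return row * image_width + col


# blockIdx = (3, 2), threadIdx = (0, 1), assumed blockDim = (2, 2), width 12
index = pixel_index((3, 2), (0, 1), (2, 2), 12)
print(index)
```

With these values the column is 3 × 2 + 0 = 6 and the row is 2 × 2 + 1 = 5, so the index is 5 × 12 + 6 = 66.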
