Monday, March 16, 2015

CUDA Representation of DNA Data


I read a paper that proposes representing DNA data in CUDA like this:



Each input sequence of length L has L − l + 1 l-mers. If the input sequence is represented using a character array, then an l-mer requires l bytes of memory. Instead we can represent an l-mer using an integer, with 2 bits for each residue [1] [15]. For example, the 4-mer CGGA can be represented using an integer whose binary representation is 01101000. By doing so, an l-mer with l ≤ 16 needs only 4 bytes, and l ≤ 32 needs 8 bytes of memory. So we convert the input character array into an integer array, where the integer at index i represents the l-mer starting at location i in the input sequence. After this conversion, GPU threads only need to read one integer rather than l bytes. This not only reduces register usage but also reduces the I/O time, since only one integer needs to be read. We use texture binding to read the input sequences.



I don't understand how to pack 01101000 into an integer and store it in an integer array. If l = 16, the l-mer would be represented by 16 bits − 1, and if I convert it to an integer, isn't it larger than 4 bytes? Can someone explain to me how to convert the input character array into an integer array?
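
Below is my rough guess at the conversion in plain C, done on the host before copying to the GPU. I am not sure this is what the paper means; the base-to-bits mapping (A=00, C=01, G=10, T=11) and all the names here are my own assumptions, picked only so that CGGA comes out as 01101000:

#include <stdio.h>
#include <string.h>

/* My guess at a 2-bit mapping: A=00, C=01, G=10, T=11
   (chosen so that CGGA comes out as 01101000, like in the paper). */
static unsigned int base_code(char c) {
    switch (c) {
        case 'A': return 0u;
        case 'C': return 1u;
        case 'G': return 2u;
        case 'T': return 3u;
        default:  return 0u;  /* unknown residue: treated as A in this sketch */
    }
}

/* Pack every l-mer of seq (length L) into one unsigned int, 2 bits per residue.
   out must have room for L - l + 1 integers; works for l <= 16 (32 bits). */
void encode_lmers(const char *seq, int L, int l, unsigned int *out) {
    for (int i = 0; i + l <= L; ++i) {
        unsigned int code = 0u;
        for (int j = 0; j < l; ++j)
            code = (code << 2) | base_code(seq[i + j]);
        out[i] = code;
    }
}

int main(void) {
    const char *seq = "CGGAT";           /* L = 5 */
    int L = (int)strlen(seq), l = 4;     /* -> L - l + 1 = 2 l-mers */
    unsigned int out[2];
    encode_lmers(seq, L, l, out);
    printf("%u %u\n", out[0], out[1]);   /* prints 104 163: 104 = 01101000 = CGGA */
    return 0;
}

If this guess is right, then for l = 16 the packed l-mer is 16 × 2 = 32 bits, which is exactly 4 bytes, so it still fits in one integer. But maybe I am misreading the paper.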



