Problem description
Read file into ByteArrays of 4 bytes
I would like to know how I could read a file into ByteArrays that are 4 bytes long. These arrays will be manipulated and then have to be converted back to a single array ready to be written to a file.
EDIT: Code snippet.
var arrays = new List<byte[]>();
using (var f = new FileStream("file.cfg.dec", FileMode.Open))
{
    for (int i = 0; i < f.Length; i += 4)
    {
        var b = new byte[4];
        var bytesRead = f.Read(b, i, 4);
        if (bytesRead < 4)
        {
            var b2 = new byte[bytesRead];
            Array.Copy(b, b2, bytesRead);
            arrays.Add(b2);
        }
        else if (bytesRead > 0)
            arrays.Add(b);
    }
}
foreach (var b in arrays)
{
    BitArray source = new BitArray(b);
    BitArray target = new BitArray(source.Length);
    target[26] = source[0];
    target[31] = source[1];
    target[17] = source[2];
    target[10] = source[3];
    target[30] = source[4];
    target[16] = source[5];
    target[24] = source[6];
    target[2] = source[7];
    target[29] = source[8];
    target[8] = source[9];
    target[20] = source[10];
    target[15] = source[11];
    target[28] = source[12];
    target[11] = source[13];
    target[13] = source[14];
    target[4] = source[15];
    target[19] = source[16];
    target[23] = source[17];
    target[0] = source[18];
    target[12] = source[19];
    target[14] = source[20];
    target[27] = source[21];
    target[6] = source[22];
    target[18] = source[23];
    target[21] = source[24];
    target[3] = source[25];
    target[9] = source[26];
    target[7] = source[27];
    target[22] = source[28];
    target[1] = source[29];
    target[25] = source[30];
    target[5] = source[31];
    var back2byte = BitArrayToByteArray(target);
    arrays.Clear();
    arrays.Add(back2byte);
}
using (var f = new FileStream("file.cfg.enc", FileMode.Open))
{
    foreach (var b in arrays)
        f.Write(b, 0, b.Length);
}
EDIT 2: Here is the Ugly Betty‑looking code that accomplishes what I wanted. Now I must refine it for performance...
var arrays_ = new List<byte[]>();
var arrays_save = new List<byte[]>();
var arrays = new List<byte[]>();
using (var f = new FileStream("file.cfg.dec", FileMode.Open))
{
    for (int i = 0; i < f.Length; i += 4)
    {
        var b = new byte[4];
        var bytesRead = f.Read(b, 0, b.Length);
        if (bytesRead < 4)
        {
            var b2 = new byte[bytesRead];
            Array.Copy(b, b2, bytesRead);
            arrays.Add(b2);
        }
        else if (bytesRead > 0)
            arrays.Add(b);
    }
}
foreach (var b in arrays)
{
    arrays_.Add(b);
}
foreach (var b in arrays_)
{
    BitArray source = new BitArray(b);
    BitArray target = new BitArray(source.Length);
    target[26] = source[0];
    target[31] = source[1];
    target[17] = source[2];
    target[10] = source[3];
    target[30] = source[4];
    target[16] = source[5];
    target[24] = source[6];
    target[2] = source[7];
    target[29] = source[8];
    target[8] = source[9];
    target[20] = source[10];
    target[15] = source[11];
    target[28] = source[12];
    target[11] = source[13];
    target[13] = source[14];
    target[4] = source[15];
    target[19] = source[16];
    target[23] = source[17];
    target[0] = source[18];
    target[12] = source[19];
    target[14] = source[20];
    target[27] = source[21];
    target[6] = source[22];
    target[18] = source[23];
    target[21] = source[24];
    target[3] = source[25];
    target[9] = source[26];
    target[7] = source[27];
    target[22] = source[28];
    target[1] = source[29];
    target[25] = source[30];
    target[5] = source[31];
    var back2byte = BitArrayToByteArray(target);
    arrays_save.Add(back2byte);
}
// FileMode.Create creates the output file, or truncates it if it already exists.
using (var f = new FileStream("file.cfg.enc", FileMode.Create))
{
    foreach (var b in arrays_save)
        f.Write(b, 0, b.Length);
}
EDIT 3: Loading a big file into byte arrays of 4 bytes wasn't the smartest idea... I have over 68 million arrays being processed and manipulated. I really wonder if it's possible to load it into a single array and still have the bit manipulation work. :/
‑‑‑‑‑
Reference solutions
Method 1:
Here's another way, similar to @igofed's solution:
var arrays = new List<byte[]>();
using (var f = new FileStream("test.txt", FileMode.Open))
{
    for (int i = 0; i < f.Length; i += 4)
    {
        var b = new byte[4];
        // The second argument is the offset into b, not into the file,
        // so it must be 0; the stream position advances on its own.
        var bytesRead = f.Read(b, 0, 4);
        if (bytesRead < 4)
        {
            // Last chunk of a file whose length is not a multiple of 4.
            var b2 = new byte[bytesRead];
            Array.Copy(b, b2, bytesRead);
            arrays.Add(b2);
        }
        else if (bytesRead > 0)
            arrays.Add(b);
    }
}
//make changes to arrays
using (var f = new FileStream("test-out.txt", FileMode.Create))
{
    foreach (var b in arrays)
        f.Write(b, 0, b.Length);
}
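The question also asks for the chunks to end up back in a single array. A minimal sketch of that step, assuming the chunks are held in a List<byte[]> as above; the names ChunkJoiner and Join are made up for illustration and do not come from the answer:

```csharp
using System;
using System.Collections.Generic;

public static class ChunkJoiner
{
    // Concatenates the chunks, in order, into one newly allocated array.
    public static byte[] Join(IReadOnlyList<byte[]> chunks)
    {
        // First pass: compute the total size so the result is allocated once.
        int total = 0;
        foreach (var chunk in chunks)
            total += chunk.Length;

        var result = new byte[total];
        int offset = 0;
        foreach (var chunk in chunks)
        {
            // Buffer.BlockCopy counts in bytes, which maps directly onto byte[].
            Buffer.BlockCopy(chunk, 0, result, offset, chunk.Length);
            offset += chunk.Length;
        }
        return result;
    }
}
```

The joined array could then be written in one call, for example with File.WriteAllBytes("test-out.txt", ChunkJoiner.Join(arrays)).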
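Method 2:
Here is what you want, provided the file is text (StreamReader and StreamWriter work on decoded characters, not raw bytes):
using (var reader = new StreamReader("inputFileName"))
{
    using (var writer = new StreamWriter("outputFileName"))
    {
        char[] buff = new char[4];
        int readCount = 0;
        while ((readCount = reader.Read(buff, 0, 4)) > 0)
        {
            //manipulations with buff
            // Write only the characters actually read; the last chunk may hold fewer than 4.
            writer.Write(buff, 0, readCount);
        }
    }
}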
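Because StreamReader and StreamWriter decode and encode text, a byte-oriented variant of the same loop may be safer for binary data. The following is a sketch under assumptions, not the answer's code: the per-chunk manipulation is supplied by the caller, and ChunkStreamer, Process and the transform parameter are illustrative names.

```csharp
using System;
using System.IO;
using System.Text;

public static class ChunkStreamer
{
    // Reads 'input' in chunks of up to 4 bytes, applies 'transform' to each
    // chunk, and writes the result to 'output'. Only one chunk is held in
    // memory at a time.
    public static void Process(Stream input, Stream output, Func<byte[], byte[]> transform)
    {
        // leaveOpen: true, so the caller keeps ownership of the input stream.
        using (var reader = new BinaryReader(input, Encoding.UTF8, true))
        {
            byte[] chunk;
            // ReadBytes returns fewer than 4 bytes at the end of the stream
            // and an empty array once the stream is exhausted.
            while ((chunk = reader.ReadBytes(4)).Length > 0)
            {
                byte[] result = transform(chunk);
                output.Write(result, 0, result.Length);
            }
        }
    }
}
```

A caller would open the two files with File.OpenRead and File.Create and pass them in. ReadBytes allocates a new array per chunk, so this trades some garbage-collector work for a small, constant memory footprint.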
Method 3:
// Needs "using System.IO;" and "using System.Linq;"; path is the input file name.
IEnumerable<byte[]> arraysOf4Bytes = File
    .ReadAllBytes(path)
    .Select((b, i) => new { b, i })
    .GroupBy(x => x.i / 4)
    .Select(g => g.Select(x => x.b).ToArray());
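The GroupBy approach creates an anonymous object for every byte. On .NET 6 or later (an assumption about the target runtime), Enumerable.Chunk performs the same split directly, and SelectMany flattens the pieces back into a single array. A sketch with illustrative names (LinqChunks, Split, Join):

```csharp
using System.Collections.Generic;
using System.Linq;

public static class LinqChunks
{
    // Splits 'data' into arrays of up to 4 bytes; the last one may be shorter.
    // Enumerable.Chunk requires .NET 6 or later.
    public static List<byte[]> Split(byte[] data)
    {
        return data.Chunk(4).ToList();
    }

    // Flattens the chunks back into one array, preserving their order.
    public static byte[] Join(IEnumerable<byte[]> chunks)
    {
        return chunks.SelectMany(chunk => chunk).ToArray();
    }
}
```

This is convenient for small files; for very large inputs the per-chunk allocations remain, so it does not address the memory concern raised in EDIT 3.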
Method 4:
Regarding your "Edit 3" ... I'll bite, although it's really a diversion from the original question.
There's no reason you need Lists of arrays, since you're just breaking up the file into a continuous list of 4‑byte sequences, looping through and processing each sequence, and then looping through and writing each sequence. You can do much better. NOTE: The implementation below does not check for or handle input files whose lengths are not exactly multiples of 4. I leave that as an exercise to you, if it is important.
To directly address your comment, here is a single‑array solution. We'll ditch the List objects, read the whole file into a single byte[] array, and then copy out 4‑byte sections of that array to do your bit transforms, then put the result back. At the end we'll just slam the whole thing into the output file.
byte[] data;
using (Stream fs = File.OpenRead("E:\\temp\\test.bmp")) {
    data = new byte[fs.Length];
    fs.Read(data, 0, data.Length);
}

// One working buffer, allocated once and reused for every 4-byte group.
byte[] element = new byte[4];
for (int i = 0; i < data.Length; i += 4) {
    Array.Copy(data, i, element, 0, element.Length);
    BitArray source = new BitArray(element);
    BitArray target = new BitArray(source.Length);
    target[26] = source[0];
    target[31] = source[1];
    // ...
    target[5] = source[31];
    // Write the permuted bits straight back over the original 4 bytes.
    target.CopyTo(data, i);
}

// File.Create truncates an existing file; File.OpenWrite would leave stale trailing bytes.
using (Stream fs = File.Create("E:\\temp\\test_out.bmp")) {
    fs.Write(data, 0, data.Length);
}
All of the ugly initial read code is gone since we're just using a single byte array. Notice I reserved a single 4-byte array before the processing loop to re-use, so we can save the garbage collector some work.
Then we loop through the giant data array 4 bytes at a time and copy them into our working array, use that to initialize the BitArrays for your transforms, and then the last statement in the block converts the BitArray back into a byte array and copies it directly back to its original location within the giant data array. This replaces your BitArrayToByteArray method, since you did not provide it. At the end, writing is also easy since it's just slamming out the now-transformed giant data array.
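The exercise left open above (input lengths that are not a multiple of 4) could be handled by transforming only the complete 4-byte groups and leaving any trailing 1 to 3 bytes unchanged. Whether that is the right treatment for the tail is an assumption; the file format may call for padding instead. TailSafe, TransformInPlace and the transform parameter are hypothetical names used only for this sketch:

```csharp
using System;

public static class TailSafe
{
    // Applies 'transform' to each complete 4-byte group of 'data' in place.
    // Trailing bytes that do not fill a group are left untouched (assumption).
    public static void TransformInPlace(byte[] data, Action<byte[]> transform)
    {
        var element = new byte[4];                        // reused working buffer
        int fullLength = data.Length - (data.Length % 4); // largest multiple of 4

        for (int i = 0; i < fullLength; i += 4)
        {
            Array.Copy(data, i, element, 0, 4);
            transform(element);                           // mutates 'element'
            Array.Copy(element, 0, data, i, 4);
        }
    }
}
```

The BitArray permutation from the loop above would go inside the transform delegate, writing its result back into the 4-byte buffer it receives.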
When I ran your original solution I got an OutOfMemoryException on my original test file of 100 MB, so I used a 44 MB file. It consumed 650 MB of memory and ran in 30 seconds. The single-array solution used 54 MB of memory and ran in 10 seconds. Not a bad improvement, and it demonstrates how bad holding onto millions of small array objects is.
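If the remaining run time matters, the two BitArray allocations per group might be avoided by permuting the bits of a 32-bit integer instead. This is a sketch of an alternative technique, not the method used in this answer. It assumes the mapping from the question (target[Map[k]] = source[k]) and relies on BitArray's documented bit order: bit k lives in byte k / 8, least significant bit first, which matches a little-endian 32-bit word. BitPermuter, Map, Permute and PermuteInPlace are illustrative names.

```csharp
using System;

public static class BitPermuter
{
    // Map[k] is the target bit index for source bit k, copied from the
    // question's "target[...] = source[k]" assignments.
    private static readonly int[] Map =
    {
        26, 31, 17, 10, 30, 16, 24,  2,
        29,  8, 20, 15, 28, 11, 13,  4,
        19, 23,  0, 12, 14, 27,  6, 18,
        21,  3,  9,  7, 22,  1, 25,  5
    };

    // Permutes the bits of a single 32-bit word.
    public static uint Permute(uint value)
    {
        uint result = 0;
        for (int k = 0; k < 32; k++)
        {
            if (((value >> k) & 1u) != 0)
                result |= 1u << Map[k];
        }
        return result;
    }

    // Transforms every complete 4-byte group of 'data' in place; trailing
    // bytes that do not fill a group are left unchanged.
    public static void PermuteInPlace(byte[] data)
    {
        for (int i = 0; i + 4 <= data.Length; i += 4)
        {
            // Little-endian assembly, so bit k of 'value' equals BitArray index k.
            uint value = (uint)data[i]
                       | ((uint)data[i + 1] << 8)
                       | ((uint)data[i + 2] << 16)
                       | ((uint)data[i + 3] << 24);
            uint result = Permute(value);
            data[i]     = (byte)(result & 0xFF);
            data[i + 1] = (byte)((result >> 8) & 0xFF);
            data[i + 2] = (byte)((result >> 16) & 0xFF);
            data[i + 3] = (byte)((result >> 24) & 0xFF);
        }
    }
}
```

No timings are claimed for this version; it would need to be measured against the BitArray loop on the same file before drawing conclusions.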
(by Alan Alvarez, Tim S., igofed, spender, Justin Aquadro)