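Windows 10 IoT Enterprise on ARM supports emulation of x86 applications, while Windows 11 IoT Enterprise on ARM supports emulation of both x86 and x64 applications. The ESM8400, a business-card-sized ARM64 industrial control board from Emtronix (英创), can ship with a licensed copy of Windows 10 IoT Enterprise preinstalled, and x86 programs run on it directly without modification.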
Below, a small program is written and built in both x86 and ARM64 formats to compare how efficiently each runs. The test code is listed next. The TestSmp function takes two input parameters: the first is the number of test threads to create, and the second is how long each thread runs. cbTestSmp is the test thread itself. Inside a while loop it repeatedly reads a memory buffer and compares each byte with its preset value, and it leaves the loop automatically once the configured time has elapsed. The threadParam->loops variable records the total number of times the while loop has executed.
#include <windows.h>
#include <stdio.h>
#include <stdlib.h>

typedef struct _SMP_THREAD_PARAM
{
    UINT32 durationMs;    // how long the thread runs, in milliseconds
    UINT32 cpuId;         // index of the thread, used in log messages
    UINT64 loops;         // number of completed read-back passes
    BOOL   bSetAffinity;  // set by TestSmp; not used by cbTestSmp
    UINT32 sandBoxSize;   // size of the test buffer, in bytes
    LPVOID sandBoxStart;  // start address of the test buffer
} SMP_THREAD_PARAM, *PSMP_THREAD_PARAM;

DWORD WINAPI cbTestSmp(LPVOID param)
{
    PSMP_THREAD_PARAM threadParam = (PSMP_THREAD_PARAM)param;
    DWORD tStart = GetTickCount();
    UINT8* buffer = (UINT8*)threadParam->sandBoxStart;

    wprintf(L"Ahou, Thread %u, running for %u ms\r\n", threadParam->cpuId, threadParam->durationMs);

    // Write a known pattern to the sandbox
    for (UINT32 i = 0; i < threadParam->sandBoxSize; i++)
    {
        buffer[i] = (UINT8)(i);
    }

    // Unsigned subtraction keeps the elapsed time valid across a GetTickCount wrap-around
    while ((GetTickCount() - tStart) < threadParam->durationMs)
    {
        // Read back from the sandbox and compare with the expected pattern
        for (UINT32 i = 0; i < threadParam->sandBoxSize; i++)
        {
            if (buffer[i] != (UINT8)(i))
            {
                wprintf(L"Thread %u : error at byte %u for loop %I64u !!\r\n", threadParam->cpuId, i, threadParam->loops);
            }
        }
        threadParam->loops++;
    }

    wprintf(L"Thread %u : terminating\r\n", threadParam->cpuId);
    return 0;
}

void TestSmp(UINT32 nCpus, UINT32 durationMs)
{
    UINT32 i;
    PSMP_THREAD_PARAM threadParams;
    HANDLE* threadHandles;
    UINT64 totalLoops = 0;
    UINT32 sandBoxSize = 1024 * 128; // 128 kB

    threadParams = (PSMP_THREAD_PARAM)malloc(nCpus * sizeof(SMP_THREAD_PARAM));
    if (threadParams == NULL)
    {
        wprintf(L"Failed allocating thread params !\r\n");
        return;
    }

    threadHandles = (HANDLE*)malloc(nCpus * sizeof(HANDLE));
    if (threadHandles == NULL)
    {
        wprintf(L"Failed allocating thread handles !\r\n");
        free(threadParams); // do not leak the parameter array
        return;
    }

    for (i = 0; i < nCpus; i++)
    {
        threadParams[i].bSetAffinity = TRUE;
        threadParams[i].cpuId = i;
        threadParams[i].durationMs = durationMs;
        threadParams[i].loops = 0;
        threadParams[i].sandBoxSize = sandBoxSize;
        threadParams[i].sandBoxStart = malloc(sandBoxSize);
        threadHandles[i] = NULL;

        if (threadParams[i].sandBoxStart == NULL)
        {
            wprintf(L"Failed allocating sandbox for thread %u !\r\n", i);
            continue; // no buffer, so no thread is started for this slot
        }

        threadHandles[i] = CreateThread(NULL, 0, cbTestSmp, &threadParams[i], 0, NULL);
        wprintf(L"Thread handle %u : 0x%p\r\n", i, threadHandles[i]);
    }

    // Wait for each thread in turn, so every loop counter is final before it is read
    for (i = 0; i < nCpus; i++)
    {
        if (threadHandles[i] != NULL &&
            WaitForSingleObject(threadHandles[i], INFINITE) != WAIT_OBJECT_0)
        {
            wprintf(L"Failed waiting for thread %u !\r\n", i);
        }
    }
    wprintf(L"All threads exited\r\n");

    for (i = 0; i < nCpus; i++)
    {
        wprintf(L"Thread %u did run %I64u loops\r\n", i, threadParams[i].loops);
        totalLoops += threadParams[i].loops;
        free(threadParams[i].sandBoxStart);
        if (threadHandles[i] != NULL)
        {
            CloseHandle(threadHandles[i]);
        }
    }

    wprintf(L"Total number of loops %I64u (%I64u millions)\r\n", totalLoops, totalLoops / 1000000);

    free(threadHandles);
    free(threadParams);
}
The test code above was built once in x86 format and once in ARM64 format, with the while loop set to run for 10000 ms. The results on the ESM8400 are as follows:
Efficiency comparison of x86 and ARM64 programs running on the ESM8400 Win10 ARM industrial control board
As the results show, the same code built in native ARM64 format runs more than 2.2 times as efficiently as the x86 build.
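The loop totals printed by TestSmp can be turned into a ratio or a read rate with a little arithmetic. The sketch below is only an illustration: the helper names speedup_ratio and sandbox_bytes_per_second are hypothetical and not part of the original test program. It assumes each counted loop reads the whole sandbox once, as cbTestSmp does, and that the thread ran for exactly the requested duration. The real run can overshoot slightly, because the time is checked only once per pass.

```c
#include <assert.h>
#include <stdint.h>

/* Ratio between two loop totals, e.g. native ARM64 build vs. emulated x86 build.
 * Both totals must come from runs with the same thread count and duration. */
double speedup_ratio(uint64_t native_loops, uint64_t emulated_loops)
{
    assert(emulated_loops > 0); /* a zero baseline has no meaningful ratio */
    return (double)native_loops / (double)emulated_loops;
}

/* Approximate bytes read back per second by one thread.
 * Every counted loop scans sandbox_size bytes; duration_ms is the requested run time. */
double sandbox_bytes_per_second(uint64_t loops, uint32_t sandbox_size, uint32_t duration_ms)
{
    assert(duration_ms > 0);
    return (double)loops * (double)sandbox_size * 1000.0 / (double)duration_ms;
}
```

With the settings used in this article, sandbox_size would be 131072 (128 kB) and duration_ms would be 10000. The loop counts themselves have to be taken from the program output.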
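Thanks to the good compatibility of Microsoft's operating systems and development tools, a second comparison was easy to set up. The same TestSmp test code was compiled, without modification, into a WEC7 application in VS2008, and the same test was run on several Emtronix WEC7 industrial control boards. The results are as follows: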
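The ESM3354 is the first industrial control board with WEC7 preinstalled that Emtronix launched, 10 years ago. Its main CPU is the AM3354, a single-core Cortex-A8 chip from TI, and the ESM3354 is still supplied in volume today. The ESM8400, which runs Windows 10 IoT, uses NXP's i.MX8M Plus, a quad-core Cortex-A53, as its main CPU. Compared with the ESM3354 of 10 years ago, the ESM8400 delivers a performance improvement of more than 10 times.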
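Windows IoT Enterprise on ARM lets device developers who are used to x86/x64 get software development under way quickly, and most documentation for Windows IoT Enterprise applies to ARM64 as well as x86/x64. Through emulation, Windows IoT on ARM runs x86/x64 programs as they are, without modification, while building a native ARM64 application directly gives the best performance, responsiveness, and power consumption.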
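When comparing builds, it can help for the test program to report which format it was compiled for and whether it is running under emulation. The following is a minimal sketch, not part of the original program: build_arch_name and is_emulated_on_arm64 are hypothetical helper names. It relies on the compiler's predefined architecture macros and on the Win32 function IsWow64Process2, which assumes a Windows 10 or later target. On other platforms or toolchains the emulation check simply reports that it cannot tell.

```c
#include <stdio.h>

#if defined(_WIN32)
#ifndef _WIN32_WINNT
#define _WIN32_WINNT 0x0A00 /* IsWow64Process2 needs the Windows 10 API level */
#endif
#include <windows.h>
#endif

/* Architecture this binary was compiled for, taken from predefined compiler macros. */
const char *build_arch_name(void)
{
#if defined(_M_ARM64) || defined(__aarch64__)
    return "ARM64";
#elif defined(_M_X64) || defined(__x86_64__)
    return "x64";
#elif defined(_M_IX86) || defined(__i386__)
    return "x86";
#elif defined(_M_ARM) || defined(__arm__)
    return "ARM32";
#else
    return "unknown";
#endif
}

/* Returns 1 if an x86/x64 build is running on an ARM64 host (i.e. emulated),
 * 0 if it is not, and -1 if this build cannot determine the answer. */
int is_emulated_on_arm64(void)
{
#if defined(_WIN32) && (_WIN32_WINNT >= 0x0A00) && \
    (defined(_M_IX86) || defined(_M_X64)) && !defined(_M_ARM64EC)
    USHORT processMachine = 0; /* machine type of this process, if it runs under WOW64 */
    USHORT nativeMachine = 0;  /* machine type of the host operating system */
    if (!IsWow64Process2(GetCurrentProcess(), &processMachine, &nativeMachine))
    {
        return -1;
    }
    return (nativeMachine == IMAGE_FILE_MACHINE_ARM64) ? 1 : 0;
#elif defined(_M_ARM64) || defined(__aarch64__)
    return 0; /* native ARM64 build, no emulation involved */
#else
    return -1;
#endif
}
```

Printing both values at the start of a test run, for example with printf("%s build, emulated: %d\n", build_arch_name(), is_emulated_on_arm64()), makes it harder to mix up the results of the two builds. Under the assumptions above, the x86 build on an ARM64 board would be expected to report "x86" and 1, and the native build "ARM64" and 0.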
Chengdu Emtronix Information Technology Co., Ltd. (成都英创信息技术有限公司) 028-8618 0660