Some Observations on HPC in China

China supports various research and development efforts in CPUs and operating systems.

On the OS side, it is taking the open-source Linux direction and trying to add Chinese requirements.

On the CPU side, it also takes the open-source and licensing road. So far there are at least three different directions:

1) Loongson 龙芯 (formerly named Godson)

  • started out as a cooperation between the Institute of Computing Technology (ICT) of the Chinese Academy of Sciences (CAS) and BLX IC Design Corporation, along with 龙梦 (Lemote Technology)
  • STMicroelectronics fabricates the Loongson (Godson) chips (1, 2, 3A, and 3B); it held some marketing rights for Godson for a few years and now only does the fabrication
  • the chief architect is Dr. Hu Weiwu
  • Currently, Loongson Technology Corporation Limited, founded in 2008, designs the Loongson CPUs, provides the reference solutions, and sells the Loongson CPUs
  • Reportedly, Loongson Technology is a joint venture between the CAS Institute of Computing Technology and the Tianjin municipal government
  • It holds a license from MIPS
  • 龙梦 (Lemote Technology) 中科梦兰 will concentrate on notebooks and desktop motherboards
  • 曙光 Dawning Information Industry will concentrate on blades, servers, and other systems



Loongson3B is the first 64-bit octa-core CPU in the Loongson line. Its clock frequency is 1 GHz and its peak floating-point performance is 128 GFLOPS. Loongson3B is a low-power, high-performance processor suitable for high-performance computers, high-performance servers, digital signal processing, etc.
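
As a rough sanity check on the figures above, the quoted peak implies how many floating-point operations each core must complete per cycle. The sketch below assumes the peak is simply cores × clock × flops per cycle, which is how such figures are usually quoted; the function name is only for illustration.

```python
def flops_per_cycle_per_core(peak_gflops, cores, clock_ghz):
    """Implied floating-point operations per cycle per core.

    Assumes peak = cores * clock * flops_per_cycle (an assumption about
    how the quoted peak was computed, not a documented Loongson formula).
    """
    return peak_gflops / (cores * clock_ghz)


# Loongson3B figures as stated above: 128 GFLOPS, 8 cores, 1 GHz.
loongson3b = flops_per_cycle_per_core(peak_gflops=128, cores=8, clock_ghz=1.0)
print(loongson3b)  # 16.0
```

Sixteen operations per cycle per core would be consistent with wide vector units rather than a scalar floating-point unit.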






Due to the license from MIPS, there were questions about whether the Loongson chip is home-made or not.

But Loongson did make many improvements over MIPS64:

  • added support for more cores

  • added HT

  • added DDR2 DIMM

  • added PCI-E 2

  • added AVX

  • added HW emulation of the x86 ISA

Loongson-3C targets 28 nm. It remains to be seen who will make this chip; one possibility is TSMC.

Rumor has it that the Dawning 6000 will use the Loongson-3B.

There were a few Hot Chips presentations on:

GS564V: low power XPU with 512-Bit Vector Extension



2) ShenWei (SW 1600) is the 3rd-generation CPU and seems to follow the DEC Alpha 21164

Otherwise, very little detail is known.

There are some high-level diagrams in this link


Translation of the CPU part:



  • The 16 cores consist of 4 groups of 4 cores that are connected by a 5×5 xbar @ 1.1 GHz

  • each core supports 4 flops per cycle

  • 2 sockets = 140.8 GFLOPS

  • supports DDR3 DIMM: 8 GB/socket, 16 GB per 2 sockets

  • PCI-E 2
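
The numbers in the list above are consistent with each other if the per-core figure is read as 4 floating-point operations per cycle. A minimal sketch of that arithmetic follows; the per-cycle reading and the function name are assumptions, not something stated in the translated material.

```python
def peak_gflops(cores, flops_per_cycle, clock_ghz, sockets=1):
    """Theoretical peak in GFLOPS: sockets * cores * flops/cycle * GHz."""
    return sockets * cores * flops_per_cycle * clock_ghz


# SW 1600 figures from the list above: 16 cores, 4 flops, 1.1 GHz.
per_socket = peak_gflops(cores=16, flops_per_cycle=4, clock_ghz=1.1)
two_sockets = peak_gflops(cores=16, flops_per_cycle=4, clock_ghz=1.1, sockets=2)

print(f"{per_socket:.1f} GFLOPS per socket")      # 70.4 GFLOPS per socket
print(f"{two_sockets:.1f} GFLOPS per 2 sockets")  # 140.8 GFLOPS per 2 sockets
```

The two-socket result matches the 140.8 GFLOPS quoted in the list, which supports the per-cycle reading.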

We do not know the details of each core, but the installed HPC cluster is listed in 14th position on the Top500 list of Nov/2011

SUNWAY Blue Light MPP 神威蓝光


So this is the real thing.

3) FT-1000 飞腾
      • 1 GHz, 8 cores, 8 threads
      • OpenSPARC T2
      • 3 HT links
      • 4 DDR3 memory channels
      • 8 PCI 2.0
      • 65 nm

It first appeared in 2010

Tianhe-1A – NUDT YH MPP, Xeon X5670 6C 2.93 GHz, NVIDIA 2050


You can see that in the 2011/11/01 list the FT-1000 was dropped and Rmax did not change.

We all know that it is not easy to run Linpack on a mixture of different CPUs, but in 2010 it was a good showing for the “home made” CPU to be in the 1st rank of the Top500.

Recent comments on china-hpc

National People’s Congress Deputy Hu Weiwu, who is the chief developer of the Loongson series of microchips at the Chinese Academy of Sciences (CAS), told reporters on Saturday that the “Dawning 6000” supercomputer, jointly developed by the Institute of Computing Technology of CAS and the Dawning Information Industry Company (DIIC), will adopt Loongson microchips for the first time as its core component. It will have a computing speed of more than 1,000 trillion operations a second.
“Our information industry was using foreign technology. However, just like a country’s industry cannot always depend on foreign steel and oil, China’s information industry needs its own CPU (central processing unit),” Hu said.
The supercomputer developed by CAS and DIIC is scheduled to be available as early as this summer.
Making supercomputers with Chinese microchips is one of the nation’s major science and technology projects. Three organizations – the Institute of Computing Technology of CAS, Jiangnan Institute of Computing Technology and the National University of Defense Technology (NUDT) – have their own supercomputer projects.
According to their schedules, all three institutions will need to meet the target of using domestically developed microchips by the end of this year.
Hu said the new supercomputer will use fewer than 10,000 Loongson microchips, and will also be more energy-efficient.
Tianhe-1A, developed by NUDT in Hunan’s provincial capital Changsha, is the fastest supercomputer in the world. However, Tianhe-1A largely runs on 14,336 CPUs made by Intel, and 7,186 GPUs (graphics processing units) from NVidia, two US chip-makers.
Hu said there will be difficulties ahead as there are few applications developed for these supercomputers. “We have enough supercomputers in China but still can’t fully utilize them,” he said.
Supercomputers can be used on national defense projects as well as scientific projects in geology, meteorology and medicine. Due to the lack of software engineers for supercomputers, there are few applications available in China.
“There are lots of scientific questions waiting for answers from supercomputer simulation and calculation. But we still need good algorithm and good data collection to make it work,” Hu explained.
“Each year the electricity bill could cost more than 10 million yuan ($1.5 million) for one supercomputer, and we are only using one tenth of its capacity at most,” Hu said.
Hu added that although the China-made CPUs have improved since they were first produced in 2002, they have a long way to go to compete with US chip-makers such as Intel.
“It still needs another decade before China-made chips meet the needs of the domestic market. Hopefully after two decades, we will be able to sell our China-made CPUs to the US just like we are selling clothes and shoes,” Hu said.



  • There have been various talks at the Hot Chips conference on Godson-T and on Godson with a vector processor, but these are only at the simulator level and not really at the production level
  • The multi-core architectures of both Loongson and SW are based on box-style communication between cores and a switch between core groups; it remains to be seen whether this can scale beyond 16 cores
  • To move china-hpc to the next level, the challenge besides the software engineering effort is finding a foundry for 32/28/22 nm. TSMC is the natural source, but it remains to be seen whether the two sides will come to an agreement

About laotsao 老曹

HopBit GridComputing LLC Rockscluster Gridengine Solaris Zone, Solaris Cluster, OVM SPARC/Ldom Exadata, SPARC SuperCluster
