Abstract: Over the last couple of decades, computer architects and performance analysts have routinely attempted to profile the overhead of TCP/IP processing in an effort to understand where time is spent. It is well understood that this is a rather difficult problem, since the processing is spread across various software modules such as the network stack, interrupt routines, drivers, and the O/S scheduler. As a result, extracting micro-architectural characteristics is significantly more challenging. In this paper, we start by covering previous attempts at profiling and show what existing tools can provide in terms of execution characteristics. We then propose a detailed methodology that combines full-system simulation, cycle-accurate simulation, and symbol annotation to provide a rich view of packet execution. We discuss initial results based on our profiling methodology. This includes an analysis of execution characteristics (such as instruction breakdown, CPI, MPI, and TLB misses) on a state-of-the-art microprocessor.