
Process Affinity Mask


Pavlos Parissis


Hi,

 

I am using the Process Affinity Mask to bind a specific PBX to a CPU core, so that I can run 2 PBXs on different CPU cores.

I noticed that even though I explicitly set the CPU core, the PBX has a few threads running on a different CPU core.

 

with <affinity_mask>01</affinity_mask>

 

the PBX runs on the 1st CPU core:

[root@node-01 proc]# taskset -p 2483

pid 2483's current affinity mask: 1

[root@node-01 proc]# ps --pid 2483 -o psr,pid

PSR PID

0 2483 ## ps counts from 0 so 0 is the 1st CPU core
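For readers following along: the mask printed by taskset is a hexadecimal bitmask in which bit N stands for CPU N, which is why mask 1 matches PSR 0 here. A minimal sketch of decoding such a mask in shell is shown below. The function name `mask_to_cpus` is made up for illustration, and the argument is assumed to be given without a `0x` prefix.

```shell
# Hypothetical helper: list the CPU indexes selected by a hex affinity mask.
# The body runs in a subshell "( ... )" so its variables do not leak.
mask_to_cpus() (
    mask=$((0x$1))   # interpret the argument as hexadecimal
    cpu=0
    out=""
    while [ "$mask" -gt 0 ]; do
        # Bit N set means CPU N is allowed.
        if [ $((mask & 1)) -eq 1 ]; then
            out="$out${out:+ }$cpu"
        fi
        mask=$((mask >> 1))
        cpu=$((cpu + 1))
    done
    echo "$out"
)
```

With the values used in this thread, `mask_to_cpus 01` prints 0 and `mask_to_cpus 02` prints 1. A mask of 03 would allow both cores.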

 

 

procfs confirms this for the parent process (field 39 of stat, here 0, is the CPU the task last ran on):

[root@node-01 proc]# cat 2483/stat

2483 (pbx_01) S 1 2472 9434 34816 9434 4194624 2687 0 0 0 7 34 0 0 25 0 11 0 315039 135675904 6084 4294967295 134512640 138931033 3219518768 3219515736 15430658 0 0 16781313 0 4294967295 0 0 17 0 0 0 0

 

 

But 4 threads are running on the 2nd CPU core:

 

[root@node-01 proc]# cat 2483/task/24

2483/ 2485/ 2486/ 2487/ 2488/ 2489/ 2490/ 2491/ 2492/ 2493/ 2494/

[root@node-01 proc]# cat 2483/task/24*/stat

2483 (pbx_01) S 1 2472 9434 34816 9434 4194624 2576 0 0 0 7 7 0 0 25 0 11 0 315039 135675904 6085 4294967295 134512640 138931033 3219518768 3219515736 15430658 0 0 16781313 0 0 0 0 17 0 0 0 0

2485 (pbx_01) S 1 2472 9434 34816 9434 4194624 7 0 0 0 0 0 0 0 -2 0 11 0 315060 135675904 6085 4294967295 134512640 138931033 3219518768 3085655544 15430658 0 0 16781313 0 0 0 0 -1 0 1 2 0

2486 (pbx_01) S 1 2472 9434 34816 9434 4194368 9 0 0 0 0 27 0 0 25 0 11 0 315060 135675904 6085 4294967295 134512640 138931033 3219518768 3075231672 15430658 0 0 16781313 0 0 0 0 -1 1 0 0 0

2487 (pbx_01) S 1 2472 9434 34816 9434 4194368 71 0 0 0 0 0 0 0 18 0 11 0 315060 135675904 6085 4294967295 134512640 138931033 3219518768 3064742576 15430658 0 0 16781313 0 0 0 0 -1 0 0 0 0

2488 (pbx_01) S 1 2472 9434 34816 9434 4194368 12 0 0 0 0 0 0 0 15 0 11 0 315060 135675904 6085 4294967295 134512640 138931033 3219518768 3054252736 15430658 0 0 16781313 0 0 0 0 -1 0 0 0 0

2489 (pbx_01) S 1 2472 9434 34816 9434 4194368 3 0 0 0 0 0 0 0 25 0 11 0 315060 135675904 6085 4294967295 134512640 138931033 3219518768 3043755344 15430658 0 0 16781313 0 0 0 0 -1 1 0 0 0

2490 (pbx_01) S 1 2472 9434 34816 9434 4194368 3 0 0 0 0 0 0 0 25 0 11 0 315060 135675904 6085 4294967295 134512640 138931033 3219518768 3033266160 15430658 0 0 16781313 0 0 0 0 -1 1 0 0 0

2491 (pbx_01) S 1 2472 9434 34816 9434 4194368 2 0 0 0 0 0 0 0 25 0 11 0 315060 135675904 6085 4294967295 134512640 138931033 3219518768 3022782724 15430658 0 0 16781313 0 0 0 0 -1 0 0 0 0

2492 (pbx_01) S 1 2472 9434 34816 9434 4194368 3 0 0 0 0 0 0 0 25 0 11 0 315060 135675904 6085 4294967295 134512640 138931033 3219518768 3012292400 15430658 0 0 16781313 0 0 0 0 -1 0 0 0 0

2493 (pbx_01) S 1 2472 9434 34816 9434 4194368 1 0 0 0 0 0 0 0 20 0 11 0 315060 135675904 6085 4294967295 134512640 138931033 3219518768 3001802976 15430658 0 0 16781313 0 0 0 0 -1 1 0 0 0

2494 (pbx_01) S 1 2472 9434 34816 9434 4194368 1 0 0 0 0 0 0 0 25 0 11 0 315060 135675904 6085 4294967295 134512640 138931033 3219518768 2991313404 15430658 0 0 16781313 0 0 0 0 -1 0 0 0 0
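The column that matters in these lines is field 39 of stat, the CPU the task last ran on (see proc(5)). A small sketch that extracts it follows, assuming the stat lines arrive on standard input. The function name `stat_cpu` is illustrative only.

```shell
# Hypothetical helper: read /proc/<pid>/task/<tid>/stat lines on stdin and
# print "TID CPU" for each, where CPU is field 39 (processor last run on).
stat_cpu() {
    # Remove the "(comm)" field first, so a command name containing spaces
    # cannot shift the columns; field 39 then becomes awk's $38.
    sed 's/^\([0-9][0-9]*\) (.*) /\1 /' | awk '{ print $1, $38 }'
}

# Example call, run from /proc as in the post above:
#   cat 2483/task/*/stat | stat_cpu
```

Applied to the listing above, this reports tasks 2486, 2489, 2490 and 2493 on CPU 1 and the remaining seven on CPU 0.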

 

 

when I have

<affinity_mask>02</affinity_mask>

 

[root@node-01 proc]# taskset -p 7888

pid 7888's current affinity mask: 2

[root@node-01 proc]# ps --pid 7888 -o psr,pid

PSR PID

1 7888

 

 

Only 2 threads are running on the other CPU core:

[root@node-01 proc]# cat 7888/stat

7888 (pbx_01) S 1 7864 9434 34816 9434 4194624 2686 0 0 0 8 25 0 0 25 0 11 0 354251 135675904 6083 4294967295 134512640 138931033 3214896976 3214893944 3728386 0 0 16781313 0 4294967295 0 0 17 1 0 0 0

 

[root@node-01 proc]# cat 7888/task/78*/stat

7888 (pbx_01) S 1 7864 9434 34816 9434 4194624 2573 0 0 0 8 12 0 0 25 0 11 0 354251 135675904 6083 4294967295 134512640 138931033 3214896976 3214893944 3728386 0 0 16781313 0 0 0 0 17 1 0 0 0

7890 (pbx_01) S 1 7864 9434 34816 9434 4194624 9 0 0 0 0 0 0 0 -2 0 11 0 354280 135675904 6083 4294967295 134512640 138931033 3214896976 3084991992 3728386 0 0 16781313 0 0 0 0 -1 0 1 2 0

7891 (pbx_01) S 1 7864 9434 34816 9434 4194368 9 0 0 0 0 12 0 0 25 0 11 0 354280 135675904 6083 4294967295 134512640 138931033 3214896976 3074568120 3728386 0 0 16781313 0 0 0 0 -1 0 0 0 0

7892 (pbx_01) S 1 7864 9434 34816 9434 4194368 69 0 0 0 0 0 0 0 23 0 11 0 354281 135675904 6083 4294967295 134512640 138931033 3214896976 3064079024 3728386 0 0 16781313 0 0 0 0 -1 1 0 0 0

7893 (pbx_01) S 1 7864 9434 34816 9434 4194368 12 0 0 0 0 0 0 0 18 0 11 0 354281 135675904 6083 4294967295 134512640 138931033 3214896976 3053589184 3728386 0 0 16781313 0 0 0 0 -1 1 0 0 0

7894 (pbx_01) S 1 7864 9434 34816 9434 4194368 3 0 0 0 0 0 0 0 25 0 11 0 354281 135675904 6083 4294967295 134512640 138931033 3214896976 3043091792 3728386 0 0 16781313 0 0 0 0 -1 1 0 0 0

7895 (pbx_01) S 1 7864 9434 34816 9434 4194368 3 0 0 0 0 0 0 0 25 0 11 0 354281 135675904 6083 4294967295 134512640 138931033 3214896976 3032602608 3728386 0 0 16781313 0 0 0 0 -1 1 0 0 0

7896 (pbx_01) S 1 7864 9434 34816 9434 4194368 3 0 0 0 0 0 0 0 25 0 11 0 354281 135675904 6083 4294967295 134512640 138931033 3214896976 3022119172 3728386 0 0 16781313 0 0 0 0 -1 1 0 0 0

7897 (pbx_01) S 1 7864 9434 34816 9434 4194368 3 0 0 0 0 0 0 0 25 0 11 0 354281 135675904 6083 4294967295 134512640 138931033 3214896976 3011628848 3728386 0 0 16781313 0 0 0 0 -1 1 0 0 0

7898 (pbx_01) S 1 7864 9434 34816 9434 4194368 1 0 0 0 0 0 0 0 23 0 11 0 354281 135675904 6083 4294967295 134512640 138931033 3214896976 3001139424 3728386 0 0 16781313 0 0 0 0 -1 1 0 0 0

7899 (pbx_01) S 1 7864 9434 34816 9434 4194368 1 0 0 0 0 0 0 0 25 0 11 0 354281 135675904 6083 4294967295 134512640 138931033 3214896976 2990649852 3728386 0 0 16781313 0 0 0 0 -1 1 0 0 0

 

Why do I see this difference in CPU core allocation?

I was expecting to see all threads on 1 CPU core.

 

 

[root@node-01 pbxnsip]# ./pbx_01 --version

Version: 3.4.0.3201

[root@node-01 pbxnsip]# cat /etc/redhat-release

CentOS release 5.4 (Final)


now I see something very strange

with <affinity_mask>01</affinity_mask>

 

 

 

[root@node-01 proc]# taskset -p 22756

pid 22756's current affinity mask: 1

 

[root@node-01 proc]# ps --pid 22756 -o psr,pid

PSR PID

0 22756

 

 

 

[root@node-01 proc]# cat 22756/stat

22756 (pbx_01) S 1 22751 3176 0 -1 4194624 2689 0 0 0 9 19 0 0 25 0 11 0 463780 135675904 6086 4294967295 134512640 138931033 3216870704 3216867672 10126338 0 0 16789507 0 4294967295 0 0 17 0 0 0 0

 

several threads on the other core

 

 

[root@node-01 proc]# cat 22756/task/22*/stat

22756 (pbx_01) S 1 22751 3176 0 -1 4194624 2571 0 0 0 9 7 0 0 25 0 11 0 463780 135675904 6083 4294967295 134512640 138931033 3216870704 3216867672 10126338 0 0 16789507 0 0 0 0 17 0 0 0 0

22758 (pbx_01) S 1 22751 3176 0 -1 4194624 9 0 0 0 0 0 0 0 -2 0 11 0 463804 135675904 6083 4294967295 134512640 138931033 3216870704 3085393400 10126338 0 0 16789507 0 0 0 0 -1 0 1 2 0

22759 (pbx_01) S 1 22751 3176 0 -1 4194368 11 0 0 0 0 11 0 0 25 0 11 0 463804 135675904 6083 4294967295 134512640 138931033 3216870704 3074969528 10126338 0 0 16789507 0 0 0 0 -1 0 0 0 0

22760 (pbx_01) S 1 22751 3176 0 -1 4194368 69 0 0 0 0 0 0 0 23 0 11 0 463804 135675904 6083 4294967295 134512640 138931033 3216870704 3064480432 10126338 0 0 16789507 0 0 0 0 -1 1 0 0 0

22761 (pbx_01) S 1 22751 3176 0 -1 4194368 12 0 0 0 0 0 0 0 18 0 11 0 463804 135675904 6083 4294967295 134512640 138931033 3216870704 3053990592 10126338 0 0 16789507 0 0 0 0 -1 0 0 0 0

22762 (pbx_01) S 1 22751 3176 0 -1 4194368 3 0 0 0 0 0 0 0 25 0 11 0 463804 135675904 6083 4294967295 134512640 138931033 3216870704 3043493200 10126338 0 0 16789507 0 0 0 0 -1 1 0 0 0

22763 (pbx_01) S 1 22751 3176 0 -1 4194368 3 0 0 0 0 0 0 0 25 0 11 0 463804 135675904 6083 4294967295 134512640 138931033 3216870704 3033004016 10126338 0 0 16789507 0 0 0 0 -1 1 0 0 0

22764 (pbx_01) S 1 22751 3176 0 -1 4194368 3 0 0 0 0 0 0 0 25 0 11 0 463804 135675904 6083 4294967295 134512640 138931033 3216870704 3022520580 10126338 0 0 16789507 0 0 0 0 -1 1 0 0 0

22765 (pbx_01) S 1 22751 3176 0 -1 4194368 3 0 0 0 0 0 0 0 25 0 11 0 463804 135675904 6083 4294967295 134512640 138931033 3216870704 3012030256 10126338 0 0 16789507 0 0 0 0 -1 1 0 0 0

22766 (pbx_01) S 1 22751 3176 0 -1 4194368 1 0 0 0 0 0 0 0 23 0 11 0 463804 135675904 6083 4294967295 134512640 138931033 3216870704 3001540832 10126338 0 0 16789507 0 0 0 0 -1 1 0 0 0

22767 (pbx_01) S 1 22751 3176 0 -1 4194368 1 0 0 0 0 0 0 0 25 0 11 0 463804 135675904 6083 4294967295 134512640 138931033 3216870704 2991051260 10126338 0 0 16789507 0 0 0 0 -1 1 0 0 0


In Linux the behavior is different than in Windows. The intention behind setting the affinity is to avoid RTP jitter while the OS moves the thread between cores. In Linux, this is only relevant for the RTP thread; in Windows the whole process must be bound to one core. If you want to set the affinity for the whole process in Linux, you should do this from outside of the process using the standard Linux command line tools (see taskset).
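As a sketch of what setting it "from outside" can look like for a process that is already running: taskset -p changes a single task, so each thread ID under /proc/<pid>/task has to be handled in turn. The function name `pin_tids` and the `TASKSET` override variable below are invented for this example; only `taskset -p -c` is the real tool.

```shell
# Hypothetical helper: read thread IDs on stdin, one per line, and pin each
# one to the CPU list given as $1 (for example "1" or "0,1").
# TASKSET is an assumed override hook for dry runs; it defaults to taskset.
pin_tids() {
    while read -r tid; do
        "${TASKSET:-taskset}" -p -c "$1" "$tid"
    done
}

# Example call for a running PBX with PID 2483 (needs root or ownership):
#   ls /proc/2483/task | pin_tids 1
```

Starting the program through taskset in the first place, for example `taskset 02 ./pbx_01`, avoids the loop, because new threads inherit the mask of the thread that creates them.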



 

Initially I used taskset to control CPU core binding, and it worked very well. Since the PBX offers this functionality itself, I decided to use it and avoid extra changes to the init script. It seemed a bit strange not to use the application's own setting to control the application.

 

 

Nevertheless, I will do a bit more testing with taskset and post the results here.

If the result of the testing is the expected one[1], I will follow your advice to use taskset. Shall I assume no issues will pop up with that?

 

But you have to admit that Process Affinity on Linux is not working as expected.

 

Thanks

 

[1] all threads on the same core.


Well, usually it is a feature that the OS can move the other threads to other cores; the only concern from the PBX perspective is jitter on the RTP playout. I consider it a feature that it locks down only this one thread. But for sure it is a matter of documentation, and this forum thread should help a lot in understanding it!


I tested taskset and it works as expected: all threads are running on the CPU core set by taskset.

I am going to follow your advice and use taskset to bind a PBX to a specific core, since the setting in the PBX doesn't work as expected.
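One way to make this check repeatable is to let ps list the threads together with the CPU each last ran on, and compare the values. A rough sketch follows; `single_cpu` is a made-up name. Since psr is only a snapshot of the last CPU used, the mask itself is still best confirmed with taskset -p.

```shell
# Hypothetical helper: read "TID CPU" lines on stdin and succeed only if
# every line reports the same CPU (and there is at least one line).
single_cpu() {
    awk 'NR == 1 { cpu = $2 }
         $2 != cpu { bad = 1 }
         END { exit (NR == 0 || bad) }'
}

# Example call; "-L" lists threads, "tid=" and "psr=" suppress the header:
#   ps -L -p "$(cat /var/run/pbx_01.pid)" -o tid=,psr= | single_cpu && echo "one core"
```

The PID file path in the example call is the one the init script below would produce for PBX=pbx_01.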

 

I hope this will not cause any issues when actual traffic is being processed.

 

Furthermore, I modified the init script mentioned in the "LSB compliant init script" thread. Here is the code, with a small change that uses taskset to force process affinity.

#!/bin/bash
#
### BEGIN INIT INFO
# Provides: pbx_01
# Required-Start: $local_fs $network
# Required-Stop: $local_fs $network
# Default-Start:   3 4 5
# Default-Stop: 0 1 2 6
# Short-Description: start and stop pbx_01
# Description: Init script for pbxnsip.
### END INIT INFO

# source function library
. /etc/init.d/functions

RETVAL=0

# CPU affinity mask (hex) handed to taskset; leave empty to start unpinned
CPU="02"
# Installation location
INSTALLDIR=/pbx_service_01/pbxnsip
PBX_CONFIG=$INSTALLDIR/pbx.xml
PBX=pbx_01
PID_FILE=/var/run/$PBX.pid
LOCK_FILE=/var/lock/subsys/$PBX
PBX_OPTIONS="--dir $INSTALLDIR --config $PBX_CONFIG --pidfile $PID_FILE"


# Disable the below when pbx is under pacemaker Cluster resource manager
# Pacemaker doesn't like exit 5
#[ -x $INSTALLDIR/$PBX ] || exit 5

start()
{
       echo -n "Starting PBX: "
       # Pin the PBX with taskset when a CPU mask is configured
       if [ -n "${CPU:-}" ]; then
          daemon --pidfile "$PID_FILE" taskset "$CPU" "$INSTALLDIR/$PBX" $PBX_OPTIONS
       else
          daemon --pidfile "$PID_FILE" "$INSTALLDIR/$PBX" $PBX_OPTIONS
       fi

       RETVAL=$?
       echo
       [ $RETVAL -eq 0 ] && touch $LOCK_FILE
       return $RETVAL

}
stop()
{
       echo -n "Stopping PBX: "
       killproc -p $PID_FILE $PBX
       RETVAL=$?
       echo
       [ $RETVAL -eq 0 ] && rm -f $LOCK_FILE
       return $RETVAL
}

case "$1" in
       start)
               start
               ;;
       stop)
               stop
               ;;
       restart)
               stop
               start
               ;;
       force-reload)
               stop
               start
               ;;
       status)
               status -p $PID_FILE $PBX
               RETVAL=$?
               ;;
       *)
               echo $"Usage: $0 {start|stop|restart|force-reload|status}"
               exit 2
               ;;
esac

exit $RETVAL

