
[Qemu-devel] Questions about networking


From: Peter Niessen
Subject: [Qemu-devel] Questions about networking
Date: Tue, 3 Aug 2010 10:13:26 +0200

Dear List,

I'm trying to set up a testbed for batch systems using qemu-kvm. So far,
I've created two machines, a master ("torque") and an execution host
("mom"), for use with the TORQUE batch system. I'm using the following
command lines to start up the virtual machines:

qemu-kvm -smp 2 -m 768 -hda ./torque.qcow2 \
    -net nic,vlan=1,macaddr=52:54:00:12:34:56 \
    -net nic,vlan=2,macaddr=52:54:00:12:34:57 \
    -net user,vlan=2 \
    -net socket,vlan=1,listen=localhost:1234 \
    -redir tcp:26022::22 -nographic -daemonize

qemu-kvm -smp 2 -m 768 -hda ./mom.qcow2 \
    -net nic,vlan=1,macaddr=52:54:00:12:34:58 \
    -net socket,vlan=1,connect=localhost:1234 \
    -nographic -daemonize

which I took from http://www.h7.dion.ne.jp/~qemu-win/HowToNetwork-en.html.

Basic networking works fine: I can reach the internet from "mom" via
"torque", NFS-mount the users' home directories from "torque" on "mom",
and resolve users via NIS.

Here's the ifconfig of the nodes:

torque:~ # ifconfig
eth0      Link encap:Ethernet  HWaddr 52:54:00:12:34:56
          inet addr:192.168.42.250  Bcast:192.168.42.255  Mask:255.255.255.0
          inet6 addr: fe80::5054:ff:fe12:3456/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:707 errors:0 dropped:0 overruns:0 frame:0
          TX packets:1873 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:44388 (43.3 Kb)  TX bytes:2539091 (2.4 Mb)
          Interrupt:11 Base address:0x2000

eth1      Link encap:Ethernet  HWaddr 52:54:00:12:34:57
          inet addr:10.0.2.15  Bcast:10.0.2.255  Mask:255.255.255.0
          inet6 addr: fe80::5054:ff:fe12:3457/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:69 errors:0 dropped:0 overruns:0 frame:0
          TX packets:88 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:7837 (7.6 Kb)  TX bytes:13548 (13.2 Kb)
          Interrupt:10 Base address:0xc000

And "mom":

mom:~ # ifconfig
eth0      Link encap:Ethernet  HWaddr 52:54:00:12:34:58
          inet addr:192.168.42.1  Bcast:192.168.42.255  Mask:255.255.255.0
          inet6 addr: fe80::5054:ff:fe12:3458/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:1888 errors:0 dropped:0 overruns:0 frame:0
          TX packets:752 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:2514373 (2.3 Mb)  TX bytes:60325 (58.9 Kb)
          Interrupt:11 Base address:0x2000


The ping times between the servers are the following:

torque:~ # ping mom
PING mom.qemu (192.168.42.1) 56(84) bytes of data.
64 bytes from mom.qemu (192.168.42.1): icmp_seq=1 ttl=64 time=39.6 ms
64 bytes from mom.qemu (192.168.42.1): icmp_seq=2 ttl=64 time=39.4 ms
64 bytes from mom.qemu (192.168.42.1): icmp_seq=3 ttl=64 time=39.7 ms
64 bytes from mom.qemu (192.168.42.1): icmp_seq=4 ttl=64 time=39.8 ms
64 bytes from mom.qemu (192.168.42.1): icmp_seq=5 ttl=64 time=39.8 ms
64 bytes from mom.qemu (192.168.42.1): icmp_seq=6 ttl=64 time=39.8 ms
64 bytes from mom.qemu (192.168.42.1): icmp_seq=7 ttl=64 time=39.8 ms

Do round-trip times of about 40 ms make sense for two guests running on
the same host?

However, batch operations are not working properly. Jobs start fine and
produce the right output, but when it comes to tidying up, the "mom"
machine can't contact "torque":

Aug  3 10:10:26 mom pbs_mom: LOG_ERROR::Operation now in progress (115)
in scan_for_exiting, cannot connect to port 1023 in client_to_svr -
connection refused
Aug  3 10:10:27 mom pbs_mom: LOG_ERROR::Operation now in progress (115)
in scan_for_exiting, cannot connect to port 1023 in client_to_svr -
connection refused
Aug  3 10:10:28 mom pbs_mom: LOG_ERROR::Operation now in progress (115)
in scan_for_exiting, cannot connect to port 1023 in client_to_svr -
connection refused
Aug  3 10:10:29 mom pbs_mom: LOG_ERROR::Operation now in progress (115)
in scan_for_exiting, cannot connect to port 1023 in client_to_svr -
connection refused
Aug  3 10:10:29 mom pbs_mom: LOG_ERROR::Operation now in progress (115)
in scan_for_exiting, cannot connect to port 1023 in client_to_svr -
connection refused
Aug  3 10:10:29 mom pbs_mom: LOG_ERROR::Operation now in progress (115)
in scan_for_exiting, cannot connect to port 1023 in client_to_svr -
connection refused
Aug  3 10:10:30 mom pbs_mom: LOG_ERROR::Operation now in progress (115)
in scan_for_exiting, cannot connect to port 1023 in client_to_svr -
connection refused
Aug  3 10:10:31 mom pbs_mom: LOG_ERROR::Operation now in progress (115)
in scan_for_exiting, cannot connect to port 1023 in client_to_svr -
connection refused
Aug  3 10:10:32 mom pbs_mom: LOG_ERROR::Operation now in progress (115)
in scan_for_exiting, cannot connect to port 1023 in client_to_svr -
connection refused
Aug  3 10:10:33 mom pbs_mom: LOG_ERROR::Operation now in progress (115)
in scan_for_exiting, cannot connect to port 1023 in client_to_svr -
connection refused
Aug  3 10:10:34 mom pbs_mom: LOG_ERROR::Operation now in progress (115)
in scan_for_exiting, cannot connect to port 1023 in client_to_svr -
connection refused
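
To check whether the failure can be reproduced outside of pbs_mom, a
connect attempt with the same source port could be made by hand. The
following Python sketch is only an illustration: the function name
probe_connect is made up for this example, and the destination port
15001 mentioned below is an assumption (the usual TORQUE default for the
"pbs" service), not something taken from the logs above.

```python
import errno
import socket
import time


def probe_connect(host, port, source_port=0, timeout=1.0):
    """Try one TCP connect; return (succeeded, seconds_elapsed, error_name).

    probe_connect is a hypothetical helper for this example only.
    source_port=0 lets the kernel pick the local port; binding to a
    port below 1024 (such as 1023) requires root.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.settimeout(timeout)
    start = time.monotonic()
    try:
        if source_port:
            sock.bind(("", source_port))
        sock.connect((host, port))
        return True, time.monotonic() - start, None
    except OSError as exc:
        # Timeouts carry no errno, so fall back to the exception class name.
        name = errno.errorcode.get(exc.errno, type(exc).__name__)
        return False, time.monotonic() - start, name
    finally:
        sock.close()
```

Run as root on "mom", a call such as probe_connect("torque", 15001,
source_port=1023) would show whether a plain blocking connect succeeds
and how long it takes, which could then be compared with the ping times
above.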


At the same time, tcpdump on the "torque" machine says:


10:10:17.072582 IP mom.qemu.1023 > torque.qemu.pbs: Flags [S], seq
25915729, win 5840, options [mss 1460,sackOK,TS val 719328 ecr
0,nop,wscale 6], length 0
10:10:17.072647 IP torque.qemu.pbs > mom.qemu.1023: Flags [S.], seq
18959859, ack 25915730, win 5792, options [mss 1460,sackOK,TS val 756722
ecr 719328,nop,wscale 6], length 0
10:10:17.152568 IP mom.qemu.1023 > torque.qemu.pbs: Flags [R], seq
25915730, win 0, length 0
10:10:18.084234 IP mom.qemu.1023 > torque.qemu.pbs: Flags [S], seq
41724490, win 5840, options [mss 1460,sackOK,TS val 720340 ecr
0,nop,wscale 6], length 0
10:10:18.084297 IP torque.qemu.pbs > mom.qemu.1023: Flags [S.], seq
34766899, ack 41724491, win 5792, options [mss 1460,sackOK,TS val 757734
ecr 720340,nop,wscale 6], length 0
10:10:18.163568 IP mom.qemu.1023 > torque.qemu.pbs: Flags [R], seq
41724491, win 0, length 0
10:10:19.095909 IP mom.qemu.1023 > torque.qemu.pbs: Flags [S], seq
57533379, win 5840, options [mss 1460,sackOK,TS val 721352 ecr
0,nop,wscale 6], length 0
10:10:19.095947 IP torque.qemu.pbs > mom.qemu.1023: Flags [S.], seq
50574033, ack 57533380, win 5792, options [mss 1460,sackOK,TS val 758745
ecr 721352,nop,wscale 6], length 0
10:10:19.175628 IP mom.qemu.1023 > torque.qemu.pbs: Flags [R], seq
57533380, win 0, length 0
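
The timing in this trace can be worked out from the timestamps. The
Python sketch below computes, for each of the three attempts, the gap
between the SYN-ACK sent by "torque" and the RST arriving from "mom".
The timestamps are copied from the tcpdump output above; the names PAIRS
and gap_ms are just for this example.

```python
from datetime import datetime

# (time torque sent the SYN-ACK, time the RST from mom arrived),
# copied from the tcpdump output above.
PAIRS = [
    ("10:10:17.072647", "10:10:17.152568"),
    ("10:10:18.084297", "10:10:18.163568"),
    ("10:10:19.095947", "10:10:19.175628"),
]


def gap_ms(earlier, later):
    """Return the difference between two HH:MM:SS.ffffff stamps in ms."""
    fmt = "%H:%M:%S.%f"
    delta = datetime.strptime(later, fmt) - datetime.strptime(earlier, fmt)
    return delta.total_seconds() * 1000.0


gaps = [gap_ms(a, b) for a, b in PAIRS]
for (synack, rst), gap in zip(PAIRS, gaps):
    print(f"{synack} -> {rst}: {gap:.1f} ms")
# prints gaps of 79.9 ms, 79.3 ms and 79.7 ms
```

In each attempt the reset arrives roughly 80 ms after the SYN-ACK, which
is about twice the 40 ms ping round-trip time shown earlier. Whether
that is related to the failing connect is not clear from the trace
alone.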

netstat says:

torque:~ # netstat | grep 1023
tcp        0      0 torque.qemu:1023        mom.qemu:pbs_mom        TIME_WAIT
tcp        0      0 torque.qemu:1023        mom.qemu:pbs_mom        TIME_WAIT

Could the performance of my internal network connection (192.168.42.0/24)
be insufficient?

Thanks for your help,

Cheers, Peter.



