webmagic maintains a request queue internally, and its worker threads each take requests from that queue.

public static void main(String[] args) {
    // Start five threads that take requests from the queue and send them to the server
    Spider spider = Spider.create(new test1()).thread(5);
    for (int i = 0; i < 10; i++) {
        // Requesting this address returns the value of the parameter
        spider.addUrl("http://localhost:8888/?a=" + i);
    }
    spider.run();
}

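The output is as follows (the order is not sequential, because the five threads take requests from the queue concurrently):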

get page: http://localhost:8888/?a=2
get page: http://localhost:8888/?a=1
get page: http://localhost:8888/?a=3
get page: http://localhost:8888/?a=0
get page: http://localhost:8888/?a=4
get page: http://localhost:8888/?a=8
get page: http://localhost:8888/?a=5
get page: http://localhost:8888/?a=9
get page: http://localhost:8888/?a=6
get page: http://localhost:8888/?a=7
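
To make the queue-plus-threads idea concrete, here is a minimal sketch that uses only the JDK. It is not webmagic's own code: the class name QueueWorkersSketch and the method drain are hypothetical, a LinkedBlockingQueue stands in for the request queue, and the "download" step is reduced to recording the URL. Five workers pull from one shared queue until it is empty, so the processing order can differ from the insertion order and from run to run.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

// Hypothetical illustration, not part of webmagic.
public class QueueWorkersSketch {

    // Lets `threads` workers drain one shared queue and returns the URLs
    // in the order in which they were actually processed.
    static List<String> drain(List<String> urls, int threads) throws InterruptedException {
        BlockingQueue<String> queue = new LinkedBlockingQueue<>(urls);
        List<String> processed = Collections.synchronizedList(new ArrayList<>());
        ExecutorService pool = Executors.newFixedThreadPool(threads);

        for (int t = 0; t < threads; t++) {
            pool.submit(() -> {
                String url;
                // poll() returns null once the queue is empty, which ends this worker
                while ((url = queue.poll()) != null) {
                    // A real crawler would download and parse the page here
                    processed.add(url);
                }
            });
        }

        pool.shutdown();                              // accept no new tasks
        pool.awaitTermination(10, TimeUnit.SECONDS);  // wait for the workers to finish
        return processed;
    }

    public static void main(String[] args) throws InterruptedException {
        List<String> urls = new ArrayList<>();
        for (int i = 0; i < 10; i++) {
            urls.add("http://localhost:8888/?a=" + i);
        }
        for (String url : drain(urls, 5)) {
            System.out.println("get page: " + url);
        }
    }
}
```

Each URL is handled exactly once because poll() removes an element atomically. The only thing that is not fixed is which worker gets which URL, and when.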