Fixing a Maven compilation failure caused by duplicate classes across different jars

During development I ran into a rather bizarre problem: a class named ResultInfo appeared in three different jars, under exactly the same package path, and all three jars were required by the project. This broke the Maven build, with an error like the one in the screenshot below:

In fact, the ResultInfo in one of the three jars does not contain the method isSuccess, while the other two do. As luck would have it, Maven compiled against the jar without isSuccess, hence the compilation error.

After some searching, I found the rough cause: Java loads classes according to the classpath order. When several classes share the same name and path, the one that appears earlier on the classpath wins; once it is loaded, the later ones are never loaded. Since I package with Maven, I needed a way to control the order in which Maven arranges the dependency jars.

Later I came across an article explaining Maven's dependency mediation. According to it, starting with Maven 2.0.9, for dependencies with the same groupId and artifactId but different versions, the resolution order (assuming compile scope) is:

  1. The shallower dependency path wins:

    For example, given A->B->slf4j:1.6.2 and A->C->E->slf4j:1.4.1, slf4j:1.6.2 wins.

  2. Declaration wins: a version declared in dependencyManagement takes precedence.

  3. Override wins: a dependency declared in a child POM overrides the same dependency inherited from the parent POM.

Although my situation is not quite the same (my duplicates live in entirely different artifacts), I followed the same line of reasoning: I moved the jar whose ResultInfo lacks isSuccess to the end of the dependencies section, re-ran the Maven build, and it compiled successfully.
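For illustration, a minimal sketch of the fix (the coordinates below are made up; the point is only that the jar containing the incomplete ResultInfo is declared last, so the earlier jars supply the class on the classpath):

```xml
<dependencies>
    <!-- Hypothetical coordinates, for illustration only -->
    <dependency>
        <groupId>com.example</groupId>
        <artifactId>first-jar</artifactId>   <!-- ResultInfo WITH isSuccess -->
        <version>1.0</version>
    </dependency>
    <dependency>
        <groupId>com.example</groupId>
        <artifactId>second-jar</artifactId>  <!-- ResultInfo WITH isSuccess -->
        <version>1.0</version>
    </dependency>
    <!-- Declared last: its incomplete ResultInfo is shadowed by the jars above -->
    <dependency>
        <groupId>com.example</groupId>
        <artifactId>third-jar</artifactId>
        <version>1.0</version>
    </dependency>
</dependencies>
```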

Crawling dynamic web pages with Scrapy, Selenium, and PhantomJS

In the Python world, Scrapy has long been a mature solution for crawling. But JavaScript is used ever more widely in web pages, and more and more sites generate their content dynamically with it, defeating many pure-HTML crawling approaches. Several solutions exist for crawling such dynamic sites; Selenium plus PhantomJS is one of the simpler and more stable ones.

Selenium is a web automation testing tool with bindings for Python, among other languages. PhantomJS can be thought of as a headless browser built on the WebKit engine. We hook PhantomJS in through Selenium's WebDriver and let PhantomJS render the dynamic pages.
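Before diving into the full spider, here is a minimal sketch of the idea (it assumes the phantomjs binary is on your PATH): Selenium drives PhantomJS, PhantomJS executes the page's JavaScript, and we read back the rendered HTML.

```python
# Minimal sketch: render a JS-heavy page with PhantomJS via Selenium.
# Assumes the phantomjs executable is on PATH.
from selenium import webdriver

driver = webdriver.PhantomJS()
driver.get("http://finance.ccb.com/cn/finance/product.html")
html = driver.page_source  # the DOM after JavaScript has run
driver.quit()
```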

This post uses China Construction Bank's wealth-management products as the example:

Below is a screenshot of the bank's wealth-management page. The product data in the red box is loaded later by JavaScript rather than baked into the initial HTML.
(screenshot: the wealth-management page)

I also found that merely loading the page with PhantomJS does not run the scripts that fetch the product list. After analysing the page, I realised that if you first select a region and then load the page once more, the matching product list comes down with it.
(screenshot: the product list)

Enough talk; here is the complete code first:

The spider code:

```python
# -*- coding: utf-8 -*-
import scrapy, urlparse, re
from selenium import webdriver
from scrapy.http import HtmlResponse, Request
from scrapy.loader.processors import MapCompose
from robot.items import FinanceItem
from w3lib.html import remove_tags
from datetime import datetime
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.common.by import By
from time import sleep
from robot.db.modules import FinanceInfo
from robot.util import FinanceLoader


class CcbSpider(scrapy.Spider):
    '''
    China Construction Bank spider
    '''
    name = "ccb"
    allowed_domains = ["ccb.com"]
    module = FinanceInfo

    def __init__(self, *args, **kwargs):
        try:
            PHANTOMJS_PATH = kwargs['PHANTOMJS_PATH']
            self.driver = webdriver.PhantomJS(executable_path=PHANTOMJS_PATH,
                                              service_args=["--ssl-protocol=any",
                                                            "--ignore-ssl-errors=true",
                                                            "--load-images=false",
                                                            "--disk-cache=true"])
        except Exception as e:
            self.logger.error(e, exc_info=True)
            exit(-2)
        super(CcbSpider, self).__init__(*args, **kwargs)

    @classmethod
    def from_crawler(cls, crawler, *args, **kwargs):
        kwargs['PHANTOMJS_PATH'] = crawler.settings['PHANTOMJS_PATH']
        spider = cls(*args, **kwargs)
        spider._set_crawler(crawler)
        return spider

    def start_requests(self):
        url = 'http://finance.ccb.com/cn/finance/product.html'
        self.driver.get(url)
        # Click the element whose id is "txt"
        self.driver.find_element_by_id("txt").click()
        wait = WebDriverWait(self.driver, 2)
        # Wait until the element with class "select_hide" becomes visible
        wait.until(EC.visibility_of_element_located((By.CLASS_NAME, 'select_hide')))
        # Click the element whose id is "500000" (Chongqing)
        self.driver.find_element_by_id("500000").click()
        self.driver.get(url)
        # Select the product-category tabs via a CSS selector
        for element in self.driver.find_elements_by_css_selector(".life_tab>a"):
            element.click()
            sleep(1)
            while True:
                content = self.driver.page_source.encode('utf-8')
                url = self.driver.current_url.encode('utf-8')
                resp = HtmlResponse(url, encoding='utf-8', status=200, body=content)
                div = resp.css(".insurance_tab_detail[style*='display: block']")
                hrefs = div.css("td[class='list_title'] a::attr(href)").extract()
                for href in hrefs:
                    req = Request(url=urlparse.urljoin(url, href), callback=self.parse)
                    req.meta['parse'] = True
                    yield req

                if self.driver.find_element_by_id("pageDiv").is_displayed():
                    current, total = resp.css("#pageNum").xpath("./text()").extract()[0].split("/", 1)
                    if int(current) == int(total):
                        break
                    else:
                        self.driver.find_element_by_id("next").click()
                else:
                    break

    def parse(self, response):
        self.logger.info("Start to parse the url %s \n", response.url)
        self.logger.info("url: %s", response.url)
        load = FinanceLoader(item=FinanceItem(), response=response)
        load.add_value('updatetime', datetime.now())
        load.add_css('name', "#name", MapCompose(remove_tags))
        load.add_css('id', "#pdId", MapCompose(remove_tags))
        load.add_value('type', u"理财")
        expected_annual_return = response.css("#yieldRate2").xpath("./text()").extract()
        if len(expected_annual_return) > 0:
            expected_annual_return = expected_annual_return[0]
            tmp = re.compile(u"\d+.\d+%").findall(expected_annual_return)
            if len(tmp) == 0:
                load.add_value("expected_annual_return", expected_annual_return)
            else:
                load.add_value("expected_annual_return", u",".join(tmp))
        invest_duration = response.css("#investPeriod2").xpath("./text()").extract()
        if len(invest_duration) > 0:
            invest_duration = invest_duration[0]
            tmp = re.compile(u"(\d+)天").findall(invest_duration)
            if len(tmp) == 0:
                load.add_value("invest_duration", invest_duration)
            else:
                load.add_value("invest_duration", u",".join(tmp))
        load.add_css("currency", "#currencyType", MapCompose(remove_tags))
        load.add_css("launch_area", "#saleCitys", MapCompose(remove_tags))
        load.add_css("subtype", "#yieldSpec", MapCompose(remove_tags))
        load.add_css("risk_level", "#riskLevel", MapCompose(remove_tags))
        load.add_css("redeem", "#proMode", MapCompose(remove_tags))
        detail = response.css("#instructionUrl a::attr(href)").extract()
        if len(detail) > 0:
            detail = detail[0]
            if not detail.strip().startswith("http"):
                detail = urlparse.urljoin("http://finance.ccb.com", detail)
            load.add_value("detail", detail)
        minimum_amount = response.css("#purFloorAmt2").xpath("./text()").extract()
        if len(minimum_amount) > 0:
            minimum_amount = minimum_amount[0]
            try:
                tmp = re.compile(u"(\d+)万").search(minimum_amount).group(1)
                tmp = str(int(tmp) * 10000)
            except AttributeError as e:
                tmp = '0'
            load.add_value('minimum_amount', tmp)
        start_date = response.css("#collBgnDate3").xpath("./text()").extract()
        if len(start_date) > 0:
            start_date = start_date[0].strip()
            try:
                start_date = datetime.strptime(start_date, "%Y.%m.%d %H:%M").date()
                load.add_value("start_date", start_date)
            except Exception as e:
                pass
        end_date = response.css("#collEndDate3").xpath("./text()").extract()
        if len(end_date) > 0:
            end_date = end_date[0].strip()
            try:
                end_date = datetime.strptime(end_date, "%Y.%m.%d %H:%M").date()
                load.add_value("end_date", end_date)
            except Exception as e:
                pass
        item = load.load_item()
        self.logger.debug("ID: %s", load.get_value(response.css("#pdId").extract()[0], MapCompose(remove_tags))[0])
        self.logger.debug("item: %s", str(item))
        return item

    def closed(self, reason):
        self.driver.quit()

    def __str__(self):
        return "CcbSpider"
```

The Scrapy settings file:

```python
# -*- coding: utf-8 -*-
BOT_NAME = 'robot'

SPIDER_MODULES = ['robot.spiders']
NEWSPIDER_MODULE = 'robot.spiders'

# Logging Setting
# LOG_FILE = os.path.normpath(os.path.join(os.path.dirname(os.path.dirname(os.path.abspath(__file__))), "log/spider.log"))
LOG_LEVEL = "INFO"
LOG_STDOUT = False
LOG_FORMAT = '%(asctime)s %(filename)s[line:%(lineno)d] [%(name)s] %(levelname)s: %(message)s'

# Crawl responsibly by identifying yourself (and your website) on the user-agent
USER_AGENT = 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/51.0.2704.103 Safari/537.36'

# Configure a delay for requests for the same website (default: 0)
# See http://scrapy.readthedocs.org/en/latest/topics/settings.html#download-delay
# See also autothrottle settings and docs
DOWNLOAD_DELAY = 1
# The download delay setting will honor only one of:
# CONCURRENT_REQUESTS_PER_DOMAIN = 16
# CONCURRENT_REQUESTS_PER_IP = 16

# Disable cookies (enabled by default)
COOKIES_ENABLED = True


# Enable or disable downloader middlewares
# See http://scrapy.readthedocs.org/en/latest/topics/downloader-middleware.html
DOWNLOADER_MIDDLEWARES = {
    'robot.middlewares.PhantomJSMiddleware': 1000,
}


# Configure item pipelines
# See http://scrapy.readthedocs.org/en/latest/topics/item-pipeline.html
ITEM_PIPELINES = {
    'robot.pipelines.DBPipeline': 1000,
}

PHANTOMJS_PATH = r'/root/phantomjs/bin/phantomjs'
DB_PATH = r'mysql+pymysql://robot:passw0rd@172.23.23.113:3306/robot'
```

Code walkthrough

First, add the PhantomJS installation path to Scrapy's settings.py, i.e. the line ```PHANTOMJS_PATH = r'/root/phantomjs/bin/phantomjs'``` from the settings above.
Next, let's walk through the spider code. In the spider's `__init__`, we start Selenium's WebDriver and point it at PhantomJS: ```self.driver = webdriver.PhantomJS(executable_path=PHANTOMJS_PATH, service_args=["--ssl-protocol=any", "--ignore-ssl-errors=true", "--load-images=false", "--disk-cache=true"])```. Of the service arguments:

- ```--ssl-protocol=any, --ignore-ssl-errors=true``` relax the SSL handling
- ```--load-images=false``` stops PhantomJS from loading images, which helps speed it up
- ```--disk-cache=true``` enables the local disk cache, which likewise helps speed PhantomJS up

Then, in start_requests:

1. Simulate a click on the element with id txt: ```self.driver.find_element_by_id("txt").click()```, as shown: ![element with id txt](http://img.blog.csdn.net/20170817124919420?watermark/2/text/aHR0cDovL2Jsb2cuY3Nkbi5uZXQvSnVsa290/font/5a6L5L2T/fontsize/400/fill/I0JBQkFCMA==/dissolve/70/gravity/SouthEast)

2. Call WebDriver's explicit wait, waiting for the region popup to become visible: ```wait.until(EC.visibility_of_element_located((By.CLASS_NAME, 'select_hide')))```

3. In the popup, simulate a click on Chongqing (the element with id 500000). (screenshot: the Chongqing option)

4. Fetch the page again: ```self.driver.get(url)```.

    > PS: *my guess is that the click above stores the region information in PhantomJS's page cache, so the second fetch no longer needs the region to be set and immediately returns the matching product list.*

5. Iterate over the product-category tabs: ```for element in self.driver.find_elements_by_css_selector(".life_tab>a"): element.click()```

    (screenshot: the product categories)

6. Iterate over the products listed on the current page, check whether there is a next page, and if so simulate a click on the next-page button:

```python
if self.driver.find_element_by_id("pageDiv").is_displayed():
    current, total = resp.css("#pageNum").xpath("./text()").extract()[0].split("/", 1)
    if int(current) == int(total):
        break
    else:
        self.driver.find_element_by_id("next").click()
else:
    break
```

That is the gist of using Selenium together with PhantomJS.

PS: be sure to implement def closed(self, reason): in the spider class and quit PhantomJS explicitly in it; otherwise PhantomJS processes linger forever. When you deploy and run with scrapyd, this leak means only a few spiders can run before everything hangs.

Notes on using JPA in spring-boot

I moved to Java from Python, so I remember well how powerful and convenient sqlalchemy and the Django ORM are. The Java world has similar tools, though frankly they are a bit weaker by comparison. Java's database ORM facility is called JPA, and in Spring a dedicated project, spring-data-jpa, provides the JPA support. As I understand it, JPA is only a specification; the implementation you usually get is Hibernate, which is why pulling spring-data-jpa into the pom automatically pulls in Hibernate.

Enough talk; let's see how spring-data-jpa simplifies development with the code below. We only define an interface extending JpaRepository, and inside it we only declare methods following spring-data-jpa's naming rules. For instance, findById in this interface is equivalent to the native SQL `select * from data_center_info where id = ?`:
```java
public interface DataCenterInfoDao extends JpaRepository<DataCenterInfo, Long>,
        JpaSpecificationExecutor<DataCenterInfo> {

    /**
     * Find by id optional.
     *
     * @param id the id
     * @return the optional
     */
    Optional<DataCenterInfo> findById(Long id);
}
```

At first this felt quite convenient, but after a while I found the approach somewhat incomplete, for example:

  1. Joins cannot be expressed.

  2. It can only return whole entities such as ```DataCenterInfo```. What if I want just one field of the object? Out of luck... (one workaround is sketched right after this list)

  3. Once the method names get long, they become genuinely dizzying.
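An aside, not from the original post: depending on your Spring Data JPA version (1.10 and later), limitation 2 can also be worked around with an interface-based projection. A minimal sketch, assuming DataCenterInfo has a name property:

```java
// Hypothetical sketch: an interface-based projection. Spring Data maps each
// result onto this interface and selects only the projected column.
public interface DataCenterNameOnly {
    String getName();
}

// Declared in the repository interface, the query is still derived from the
// method name, but only the projected field comes back:
// List<DataCenterNameOnly> findByProvince(Province province);
```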

For these problems, JPA also offers a native-SQL escape hatch, for example:

```java
public interface DomainRecordHistoryDao extends JpaRepository<DomainRecordHistory, Serializable> {

    List<DomainRecordHistory> findByDomainNameAndEnterpriseIdAndCreateTime(String domainName, String enterpriseId, Date date);

    @Query(value = "select distinct(create_time) from domain_record_history where domain_name = ?1 and enterprise_id = ?2 order by create_time ASC", nativeQuery = true)
    List<Date> findCreateTimeByDomainNameAndEnterpriseId(String domainName, String enterpriseId);
}
```

But having adopted an ORM precisely to get rid of SQL, why would I regress to writing SQL again? I really could not accept that native-SQL style. Perseverance paid off: after digging through quite a few books and pages I found a better way, one also mentioned in Spring in Action (4th edition), where it is called mixing in custom functionality.
Concretely, when spring-data-jpa generates the implementation of a Repository interface, it also looks for a class whose name is the interface name with an Impl suffix appended. If such a class exists, spring-data-jpa merges its methods with the generated ones. For the ```DataCenterInfoDao``` interface above, the class it looks for is ```DataCenterInfoDaoImpl```.

I first defined the following interface:

```java
public interface DataCenterInfoAddition {
    List<Tuple> countDataCenterInfoByArea(Province belongProvince);
}
```

Next, I modified the previously defined ```DataCenterInfoDao``` so that it also extends this interface:
```java
public interface DataCenterInfoDao extends JpaRepository<DataCenterInfo, Long>,
        JpaSpecificationExecutor<DataCenterInfo>, DataCenterInfoAddition {

    /**
     * Find by id optional.
     *
     * @param id the id
     * @return the optional
     */
    Optional<DataCenterInfo> findById(Long id);
}
```

Finally, I defined a ```DataCenterInfoDaoImpl```. In this implementation I use the JPA Criteria API directly. What ```countDataCenterInfoByArea``` does cannot be expressed through a derived method name, because each result row carries two values: the table's **province field together with its count**.
```java
public class DataCenterInfoDaoImpl implements DataCenterInfoAddition {

    @PersistenceContext
    private EntityManager em;

    @Override
    public List<Tuple> countDataCenterInfoByArea(Province belongProvince) {
        CriteriaBuilder cb = em.getCriteriaBuilder();
        CriteriaQuery<Tuple> cq = cb.createQuery(Tuple.class);
        Root<DataCenterInfo> nnInfo = cq.from(DataCenterInfo.class);

        cq.multiselect(nnInfo.get("province"), cb.count(nnInfo)).groupBy(nnInfo.get("province"))
                .orderBy(cb.desc(cb.count(nnInfo)));
        if (belongProvince != null && !belongProvince.getName()
                .equals(ProvinceEnum.Jituan.getName()) && !belongProvince.getName()
                .equals(ProvinceEnum.Quanguo.getName())) {
            cq.where(cb.equal(nnInfo.get("belongProvince"), belongProvince));
        }
        return em.createQuery(cq).getResultList();
    }
}
```
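For completeness, a rough sketch of how the Tuple results might be consumed (this usage is my illustration, not from the original post; javax.persistence.Tuple is indexed by select position):

```java
// Hypothetical usage: index 0 holds the province, index 1 the count,
// matching the order of multiselect(...) above.
List<Tuple> rows = dataCenterInfoDao.countDataCenterInfoByArea(province);
for (Tuple row : rows) {
    Object prov = row.get(0);
    Long count = row.get(1, Long.class);
    // ... aggregate or render as needed
}
```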

Binary file download in spring-boot

Recently I needed to implement Excel import and export in a web project. Import was painless, but export cost me half a day: spring-boot has a pitfall here that you only appreciate once you have stepped into it. Without further ado, here is the download implementation. For reuse, I put the upload and download functionality in an abstract controller class; any controller that needs it can simply extend this class and implement its abstract methods.

```java
abstract public class UploadAndDownloadController {

    private Logger logger = LoggerFactory.getLogger(UploadAndDownloadController.class);

    /**
     * Download response entity.
     *
     * @param id the id
     * @return the response entity
     */
    @RequestMapping(value = "/download", method = RequestMethod.GET)
    @ResponseBody
    public ResponseEntity<byte[]> download(@RequestParam(value = "id", required = false) List<String> id) {
        File file = generate(parseId(id));
        try {
            return ControllerUtil.createBytesResponse("download.xls", file);
        } catch (Exception e) {
            logger.error("文件下载失败", e);
            return null;
        } finally {
            file.delete();
        }
    }

    /**
     * Upload response entity.
     *
     * @param file the file
     * @return the response entity
     */
    @RequestMapping(value = "/upload", method = RequestMethod.POST)
    @ResponseBody
    public ResponseEntity<JSONObject> upload(@RequestParam("file") MultipartFile file) {
        if (!file.isEmpty()) {
            try {
                introduce(file.getInputStream());
            } catch (Exception e) {
                logger.error("文件导入失败", e);
                return ControllerUtil.createFileUploadFailureResponse("文件导入失败");
            }
        }
        return ControllerUtil.createFileUploadSuccessResponse();
    }

    /**
     * Generate file.
     *
     * @param id the id
     * @return the file
     */
    protected abstract File generate(List<Long> id);

    /**
     * Introduce.
     *
     * @param in the in
     */
    protected abstract void introduce(InputStream in);

    protected abstract File getTemplateFile() throws IOException;

    private List<Long> parseId(List<String> id) {
        if (id == null) {
            return null;
        }
        return id.stream().filter(s -> {
            try {
                Long.parseLong(s);
            } catch (Exception e) {
                return false;
            }
            return true;
        }).map(Long::parseLong).collect(Collectors.toList());
    }
}
```
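ControllerUtil above is a small helper of my own and is not shown in this post. As a rough idea, createBytesResponse might look something like the following sketch (the header choices are my assumption, not the actual implementation):

```java
// Hypothetical sketch of ControllerUtil.createBytesResponse: read the file
// into memory and mark the response as an attachment so the browser
// treats it as a download.
public static ResponseEntity<byte[]> createBytesResponse(String filename, File file) throws IOException {
    HttpHeaders headers = new HttpHeaders();
    headers.setContentType(MediaType.APPLICATION_OCTET_STREAM);
    headers.setContentDispositionFormData("attachment", filename);
    return new ResponseEntity<>(Files.readAllBytes(file.toPath()), headers, HttpStatus.OK);
}
```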

Now comes the crucial step. Without the configuration below, the download above is useless: the downloaded binary comes out garbled. The code below extends WebMvcConfigurerAdapter and registers a byte-based HTTP message converter (by default only character-based converters are registered, which is why the binary payload gets mangled without it).

```java
@Configuration
public class CustomWebAppConfigurer extends WebMvcConfigurerAdapter {

    @Override
    public void configureMessageConverters(List<HttpMessageConverter<?>> converters) {
        // This is the key line, folks
        converters.add(new ByteArrayHttpMessageConverter());
        super.configureMessageConverters(converters);
    }
}
```

Bringing spring-boot into netty

Netty is the Java world's high-concurrency framework: a single physical machine is said to sustain a million concurrent connections, making it the go-to choice for high-performance concurrency in Java. Compared with spring-boot, however, its development sits rather low in the stack; it is not as pleasant to write and development is slower.
One of my projects has high-concurrency requirements that spring-boot's embedded Tomcat cannot meet, so I chose netty as the underlying framework. To keep development productive, I tried to bring spring-boot into the picture. Thinking about it, all of Spring is built on IoC and AOP, so once those two foundational pieces are wired in, the rest of the spring-boot ecosystem inevitably becomes usable too.

I have not figured out how spring-web would be pulled in, though every other component is unproblematic. From where I stand, netty is itself a network framework, so there is little need to layer spring-web on top anyway.

The project is managed with Maven. Below is the pom.xml, trimmed to the spring-boot and netty parts:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
    <modelVersion>4.0.0</modelVersion>

    <groupId>com.test</groupId>
    <artifactId>netty.spring-boot</artifactId>
    <version>1.0.0-SNAPSHOT</version>
    <packaging>pom</packaging>

    <name>cdn-router</name>
    <url>http://maven.apache.org</url>

    <properties>
        <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
        <project.reporting.outputEncoding>UTF-8</project.reporting.outputEncoding>
        <java.version>1.8</java.version>
        <spring.version>4.3.10.RELEASE</spring.version>
        <build-tool.version>1.0.0</build-tool.version>
        <cdn-opentsdb.version>1.0.0-SNAPSHOT</cdn-opentsdb.version>
    </properties>

    <dependencies>
        <dependency>
            <groupId>io.netty</groupId>
            <artifactId>netty-all</artifactId>
            <version>4.1.13.Final</version>
        </dependency>
        <!-- Logging: Log4j2 + Slf4j -->
        <dependency>
            <groupId>org.apache.logging.log4j</groupId>
            <artifactId>log4j-api</artifactId>
            <!--<version>2.8.2</version>-->
        </dependency>
        <dependency>
            <groupId>org.apache.logging.log4j</groupId>
            <artifactId>log4j-core</artifactId>
            <!--<version>2.8.2</version>-->
        </dependency>

        <dependency>
            <groupId>org.slf4j</groupId>
            <artifactId>slf4j-api</artifactId>
            <!--<version>1.7.25</version>-->
        </dependency>

        <dependency>
            <groupId>org.apache.logging.log4j</groupId>
            <artifactId>log4j-slf4j-impl</artifactId>
            <!--<version>2.8.2</version>-->
        </dependency>
        <dependency>
            <groupId>org.springframework.boot</groupId>
            <artifactId>spring-boot-starter</artifactId>
            <exclusions>
                <exclusion>
                    <groupId>org.springframework.boot</groupId>
                    <artifactId>spring-boot-starter-logging</artifactId>
                </exclusion>
            </exclusions>
        </dependency>
        <dependency>
            <groupId>org.springframework.boot</groupId>
            <artifactId>spring-boot-starter-aop</artifactId>
        </dependency>
        <dependency>
            <groupId>org.springframework.boot</groupId>
            <artifactId>spring-boot-starter-log4j2</artifactId>
            <exclusions>
                <exclusion>
                    <groupId>org.apache.logging.log4j</groupId>
                    <artifactId>log4j-api</artifactId>
                </exclusion>
                <exclusion>
                    <groupId>org.apache.logging.log4j</groupId>
                    <artifactId>log4j-core</artifactId>
                </exclusion>
                <exclusion>
                    <groupId>org.apache.logging.log4j</groupId>
                    <artifactId>log4j-slf4j-impl</artifactId>
                </exclusion>
                <exclusion>
                    <groupId>org.slf4j</groupId>
                    <artifactId>slf4j-api</artifactId>
                </exclusion>
            </exclusions>
        </dependency>
        <dependency>
            <groupId>org.springframework.boot</groupId>
            <artifactId>spring-boot-configuration-processor</artifactId>
            <optional>true</optional>
        </dependency>
        <dependency>
            <groupId>com.alibaba</groupId>
            <artifactId>fastjson</artifactId>
            <version>1.2.31</version>
        </dependency>
    </dependencies>

    <dependencyManagement>
        <dependencies>
            <dependency>
                <!-- Import dependency management from Spring Boot -->
                <groupId>org.springframework.boot</groupId>
                <artifactId>spring-boot-dependencies</artifactId>
                <version>1.5.6.RELEASE</version>
                <type>pom</type>
                <scope>import</scope>
            </dependency>
        </dependencies>
    </dependencyManagement>

    <build>
        <finalName>${finalName}</finalName>
        <plugins>
            <plugin>
                <groupId>org.springframework.boot</groupId>
                <artifactId>spring-boot-maven-plugin</artifactId>
                <configuration>
                    <mainClass>com.test.CDNRouterServer</mainClass>
                </configuration>
                <executions>
                    <execution>
                        <goals>
                            <goal>repackage</goal>
                        </goals>
                    </execution>
                </executions>
            </plugin>
        </plugins>
    </build>
</project>
```

Next, I implement an HTTP message handler and register it as an IoC bean so that spring-boot manages it.

```java
/**
 * @Author Derek.
 * @Date 2017/7/18 9:24.
 */
@Component
@Scope("prototype")
public class OpsHttpMessageHandler extends SimpleChannelInboundHandler<FullHttpRequest> {

    @Override
    protected void channelRead0(ChannelHandlerContext ctx, FullHttpRequest msg) throws Exception {
        System.out.println("OK");
        ctx.channel().writeAndFlush(Unpooled.copiedBuffer("Channel Test".getBytes("utf-8")));
    }

    @Override
    public void exceptionCaught(ChannelHandlerContext ctx, Throwable cause) throws Exception {
        cause.printStackTrace();
        ctx.close();
    }
}
```

Note the @Scope("prototype"). Netty builds a fresh handler chain for each event loop, so by default handlers are not shared across threads. The benefit is that there are no critical sections to guard, which avoids thread contention and context switches and raises concurrency. In spring-boot, however, beans are singletons by default, meaning only one instance of the bean ever exists, which clearly clashes with netty's default assumption about handlers. So we switch the bean to prototype scope with @Scope("prototype"): every time the bean is requested, a new instance is created. Netty does also support shared handlers; annotate the handler with @Sharable and the default singleton bean works fine, as in the following code:

```java
/**
 * @Author Derek.
 * @Date 2017/5/18 10:58.
 */
@Component
@Sharable
public class Http2MessageHandler extends ChannelDuplexHandler {

    @Override
    public void channelRead(ChannelHandlerContext ctx, Object msg) throws Exception {
        System.out.println("OK");
        super.channelRead(ctx, msg);
    }
}
```

Next, I build the HTTP server around the handler created above, again as a Spring bean.

```java
@Component(value = "opsHttpServer")
public class OpsHttpServer implements Runnable {

    // Logger declaration added here; it was referenced but missing in the original listing
    private static final Logger logger = LoggerFactory.getLogger(OpsHttpServer.class);

    @Autowired
    final private HttpProperty httpProperty = null;

    @Autowired
    private ServerBootstrapFactory factory;

    @Autowired
    final private ApplicationContext applicationContext = null;

    final private EventExecutorGroup pool = new DefaultEventExecutorGroup(Runtime.getRuntime().availableProcessors() * 2);

    private static final int MAX_CONTENT_LENGTH = 1024 * 100;

    @Override
    public void run() {
        ServerBootstrap tcpBootStrap = factory.newServerBootstrap(0);
        tcpBootStrap.handler(new LoggingHandler(LogLevel.INFO))
                .childHandler(new ChannelInitializer<SocketChannel>() {
                    @Override
                    public void initChannel(SocketChannel ch) throws Exception {
                        ch.pipeline().addLast(new HttpServerCodec())
                                .addLast(new HttpContentDecompressor())
                                .addLast(new ChunkedWriteHandler())
                                .addLast(new HttpContentCompressor())
                                .addLast(new HttpObjectAggregator(MAX_CONTENT_LENGTH))
                                .addLast(pool, applicationContext.getBean("opsHttpMessageHandler", OpsHttpMessageHandler.class));
                    }
                });
        try {
            ChannelFuture cf = tcpBootStrap.bind(httpProperty.getBackport()).sync();
            cf.channel().closeFuture().sync();
        } catch (InterruptedException e) {
            logger.error(e.getMessage(), e);
        } finally {
            factory.shutdownGracefully(false);
        }
    }
}
```

The ServerBootstrapFactory injected above is likewise a Spring bean; it picks epoll when available and falls back to plain NIO otherwise:
```java
@Component
@Scope("prototype")
public class ServerBootstrapFactory {

    private EventLoopGroup bossGroup;
    private EventLoopGroup workerGroup;

    /**
     * New server bootstrap server bootstrap.
     *
     * @param ioThreadCount the io thread count
     * @return the server bootstrap
     */
    public ServerBootstrap newServerBootstrap(int ioThreadCount) {
        if (Epoll.isAvailable()) {
            return newEpollServerBootstrap(ioThreadCount);
        }

        return newNioServerBootstrap(ioThreadCount);
    }

    /**
     * Shutdown gracefully.
     *
     * @param shouldWait the should wait
     */
    public void shutdownGracefully(boolean shouldWait) {
        Future<?> workerFuture = workerGroup.shutdownGracefully();
        Future<?> bossFuture = bossGroup.shutdownGracefully();

        if (shouldWait) {
            workerFuture.awaitUninterruptibly();
            bossFuture.awaitUninterruptibly();
        }
    }

    private ServerBootstrap newNioServerBootstrap(int ioThreadCount) {
        if (ioThreadCount > 0) {
            bossGroup = new NioEventLoopGroup(ioThreadCount);
            workerGroup = new NioEventLoopGroup(ioThreadCount);
        } else {
            bossGroup = new NioEventLoopGroup();
            workerGroup = new NioEventLoopGroup();
        }

        return new ServerBootstrap().group(bossGroup, workerGroup)
                .channel(NioServerSocketChannel.class);
    }

    private ServerBootstrap newEpollServerBootstrap(int ioThreadCount) {
        if (ioThreadCount > 0) {
            bossGroup = new EpollEventLoopGroup(ioThreadCount);
            workerGroup = new EpollEventLoopGroup(ioThreadCount);
        } else {
            bossGroup = new EpollEventLoopGroup();
            workerGroup = new EpollEventLoopGroup();
        }

        return new ServerBootstrap().group(bossGroup, workerGroup)
                .channel(EpollServerSocketChannel.class);
    }
}
```

Note this line: addLast(pool, applicationContext.getBean("opsHttpMessageHandler", OpsHttpMessageHandler.class)). I fetch the opsHttpMessageHandler bean through applicationContext because this code sits inside a ChannelInitializer, and the ChannelInitializer itself is not a Spring-managed bean, so the only way to get at the bean is via the ApplicationContext.

In the code above I made the server a Runnable so that a single program can listen on several ports at once. Finally, the entry point: use the ApplicationContext to obtain the server bean, and start the server's thread with an ExecutorService.

```java
@SpringBootApplication
public class CDNRouterServer {

    private static Logger logger = LoggerFactory.getLogger(CDNRouterServer.class);

    public static void main(String[] args) throws Exception {
        ApplicationContext ctx = SpringApplication.run(CDNRouterServer.class, args);
        ExecutorService service = Executors.newCachedThreadPool();
        OpsHttpServer server = ctx.getBean("opsHttpServer", OpsHttpServer.class);
        service.execute(server);
        service.shutdown();
    }
}
```

With spring-boot on board, we can conveniently pull in spring-data-jpa for database access instead of hand-writing JDBC code, which simplifies database work enormously.

PS: I came to Java by way of Python and still pine for sqlalchemy and the Django ORM; writing SQL through JDBC is unbearable. Thankfully there is JPA. Personally I don't think JPA comes anywhere near sqlalchemy, but something beats nothing, and it is reasonably pleasant to write, even if it always feels like it's missing a limb or two...