Artificial Intelligence and Finance

class: center, middle, inverse, title-slide

# Artificial Intelligence and Finance
### 吴燕丰
### School of Finance, Jiangxi University of Finance and Economics
### 2021/04/13

---

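### Artificial intelligence

Traditional software vs. artificial intelligence

Traditional software is "rule-based": a person has to define the conditions and tell the computer what to do when those conditions are met.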

.footnote[
Source for this slide and the ones that follow: [人工智能 – Artificial intelligence | AI](https://easyai.tech/ai-definition/ai/)
]

---

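### Artificial intelligence

Traditional software vs. artificial intelligence

Artificial intelligence: the machine extracts patterns from a large volume of "specific" data, induces "specific knowledge" from them, and then applies that knowledge to real-world scenarios to solve practical problems.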
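
The contrast can be sketched in a few lines of plain Python. The transaction amounts, labels, and the hand-picked limit of 1000 are hypothetical and exist only for illustration: `rule_based` uses a threshold that a person chose, while `learn_threshold` derives one from labelled examples.

```python
# Hypothetical toy data: (transaction amount, is_suspicious) pairs.
labelled = [(20, 0), (35, 0), (50, 0), (80, 0), (400, 1), (650, 1), (900, 1)]


def rule_based(amount):
    """Traditional software: a person picks the threshold by hand."""
    return 1 if amount > 1000 else 0


def learn_threshold(samples):
    """Learn a threshold from data: try the midpoints between sorted
    amounts and keep the one that classifies the most samples correctly."""
    amounts = sorted(a for a, _ in samples)
    candidates = [(lo + hi) / 2 for lo, hi in zip(amounts, amounts[1:])]

    def accuracy(t):
        return sum((a > t) == bool(y) for a, y in samples) / len(samples)

    return max(candidates, key=accuracy)


threshold = learn_threshold(labelled)


def learned(amount):
    """Apply the threshold that was induced from the labelled data."""
    return 1 if amount > threshold else 0


print(threshold, rule_based(650), learned(650))
```

On this toy data the hand-written rule misses the 650 transaction, while the learned threshold flags it. The learned rule is only as good as the data it was derived from.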

---

### AI is a tool

![](https://easy-ai.oss-cn-shanghai.aliyuncs.com/2020-01-02-tool.png)

---

### AI only solves specific problems

![](https://easy-ai.oss-cn-shanghai.aliyuncs.com/2020-01-02-danrenwu.png)

---

### It knows the what, but not the why

It does not care about the reasons.

![](https://easy-ai.oss-cn-shanghai.aliyuncs.com/2020-01-02-guina.png)

---

### Not that intelligent yet

![](https://easy-ai.oss-cn-shanghai.aliyuncs.com/2019-12-12-lizi.png)

---

### A history of AI

![](https://easy-ai.oss-cn-shanghai.aliyuncs.com/2019-07-25-085054.jpg)

---

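### The first wave (non-intelligent chatbots)

1950s to 1960s

In October 1950, Turing put forward the idea of machine intelligence, the precursor of artificial intelligence (AI), and proposed the Turing test as a way to evaluate it.

Not long after the Turing test was proposed, people saw the first glimmer of hope that a computer might pass it.

In 1966, ELIZA, a psychotherapy chatbot, was created.

People at the time rated it highly, and some patients even preferred chatting with the bot. Its logic, however, was very simple: it had a finite library of dialogue, and when the patient said a certain keyword, the bot replied with a fixed line.

**The first wave did not use any fundamentally new technology. It relied on tricks that made the computer look like a real person, while the computer itself had no intelligence.**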
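
The keyword-matching logic described above can be sketched as follows. The keyword table and replies are hypothetical, and the real ELIZA script was richer than this; the point is only that a lookup table can produce a plausible-looking dialogue.

```python
# Hypothetical keyword -> reply table, for illustration only.
RULES = {
    "mother": "Tell me more about your family.",
    "sad": "Why do you feel sad?",
    "dream": "What does that dream suggest to you?",
}
DEFAULT_REPLY = "Please go on."


def reply(utterance):
    """Return the canned reply for the first known keyword in the input."""
    # Lowercase the input and strip punctuation so "sad." matches "sad".
    words = [w.strip(".,!?") for w in utterance.lower().split()]
    for keyword, response in RULES.items():
        if keyword in words:
            return response
    # No keyword matched: fall back to a neutral prompt.
    return DEFAULT_REPLY


print(reply("I feel sad today."))
print(reply("Hello"))
```

Nothing here understands the patient; the program only looks words up in a table.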

---

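### The second wave (speech recognition)

1980s to 1990s

In the second wave, speech recognition was one of the most representative breakthroughs. The key reason for the breakthrough was abandoning the symbolic school's approach in favor of statistical methods for solving practical problems.

In his book *Artificial Intelligence* (《人工智能》), Kai-Fu Lee, one of the key people involved, describes this process in detail.

**The biggest breakthrough of the second wave was a change of approach: dropping the symbolic school's methods and using statistical thinking to solve problems.**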

---

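### The third wave (deep learning + big data)

Early 21st century

2006 was a watershed year in the history of deep learning. That year Geoffrey Hinton published **"A Fast Learning Algorithm for Deep Belief Nets"**, and other important deep learning papers appeared as well, bringing several major breakthroughs in basic theory.

The third wave arrived mainly because two conditions had matured:

- After 2000, the rapid growth of the internet industry generated massive amounts of data, while the cost of data storage fell quickly. Together these made it feasible to store and analyze data at scale.

- GPUs steadily matured and supplied the necessary computing power, which made the algorithms more practical and lowered the cost of compute.

Once these conditions were in place, deep learning showed its strength and kept setting new records in speech recognition, image recognition, NLP, and other fields. AI products became genuinely usable (for example, the speech recognition error rate is only 6%, face recognition is more accurate than humans, and BERT set new records on 11 NLP tasks…).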

???
BERT: Bidirectional Encoder Representations from Transformers, a language representation model built on a bidirectional Transformer encoder.

---

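### The third wave (deep learning + big data)

**The third wave arrived mainly because big data and computing power were in place, which let deep learning show its enormous strength. On some tasks AI now outperforms humans, so it has reached the "usable" stage instead of remaining pure scientific research.**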

![](https://easy-ai.oss-cn-shanghai.aliyuncs.com/2020-01-02-dl-3rd.png)

.footnote[
Note: all of the AI material above is quoted from [人工智能 – Artificial intelligence | AI](https://easyai.tech/ai-definition/ai/)
]

---
class:middle,center

### End of the story!

### On to the tools!

---

### Python and machine learning

[scikit-learn: Machine Learning in Python](https://scikit-learn.org/stable/index.html)

1. [Install Anaconda](https://www.anaconda.com/)

2. [Install scikit-learn](https://scikit-learn.org/stable/install.html#installation-instructions)

```bash
pip install -U scikit-learn
```

Getting Started:

https://scikit-learn.org/stable/getting_started.html

---

### Classification and regression
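
As a minimal sketch of the two task types, the example below fits one classifier and one regressor with scikit-learn. The choice of datasets (iris, synthetic regression data) and of models (`LogisticRegression`, `LinearRegression`) is an assumption made for illustration, not something prescribed by this deck.

```python
from sklearn.datasets import load_iris, make_regression
from sklearn.linear_model import LinearRegression, LogisticRegression
from sklearn.model_selection import train_test_split

# Classification: predict a discrete label (the iris species).
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
clf = LogisticRegression(max_iter=1000).fit(X_train, y_train)
clf_accuracy = clf.score(X_test, y_test)  # fraction of correct labels

# Regression: predict a continuous value (synthetic data).
Xr, yr = make_regression(n_samples=200, n_features=3, noise=5.0, random_state=0)
Xr_train, Xr_test, yr_train, yr_test = train_test_split(Xr, yr, random_state=0)
reg = LinearRegression().fit(Xr_train, yr_train)
reg_r2 = reg.score(Xr_test, yr_test)  # coefficient of determination R^2

print(f"classification accuracy: {clf_accuracy:.2f}")
print(f"regression R^2: {reg_r2:.2f}")
```

Both estimators share the same `fit` / `score` interface; what differs is the kind of target they predict.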

---

### Clustering and dimensionality reduction

.pull-left[
<img src='clustering.png'>
]

.pull-right[
<img src='dimensionality_reduction.png'>
]
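
A minimal sketch of both ideas with scikit-learn, assuming the iris data and the `KMeans` and `PCA` estimators as illustrative choices. Neither step uses the labels.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA

X, _ = load_iris(return_X_y=True)  # the labels are deliberately ignored

# Clustering: group the samples into 3 clusters without using labels.
kmeans = KMeans(n_clusters=3, n_init=10, random_state=0)
cluster_ids = kmeans.fit_predict(X)

# Dimensionality reduction: project the 4 features onto 2 components.
pca = PCA(n_components=2)
X_2d = pca.fit_transform(X)

print(cluster_ids[:10])
print(X_2d.shape, pca.explained_variance_ratio_.sum())
```

The two-dimensional projection is convenient for plotting, for instance coloring each point by its cluster id.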

---

### Model selection and preprocessing
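
One way the two steps can fit together is sketched below: a scaler and a classifier are chained in a pipeline, and a grid search picks hyperparameters by cross-validation. The parameter grid is a small hypothetical one chosen to keep the run short.

```python
from sklearn.datasets import load_digits
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X, y = load_digits(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Preprocessing and the classifier live in one pipeline, so the scaler is
# fitted only on the training folds during cross-validation.
pipeline = make_pipeline(StandardScaler(), SVC())

# Model selection: try a small, hypothetical grid with 3-fold cross-validation.
param_grid = {"svc__C": [1, 10], "svc__gamma": ["scale", 0.01]}
search = GridSearchCV(pipeline, param_grid, cv=3)
search.fit(X_train, y_train)

# Evaluate the best configuration on data the search never saw.
test_accuracy = search.score(X_test, y_test)
print(search.best_params_, f"{test_accuracy:.3f}")
```

Keeping the held-out test set outside the search gives a less optimistic estimate than the cross-validation score alone.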

---

### Exercise: Recognizing hand-written digits

Link: [Recognizing hand-written digits](https://scikit-learn.org/stable/auto_examples/classification/plot_digits_classification.html#sphx-glr-auto-examples-classification-plot-digits-classification-py)

---

### Code

```python
# Author: Gael Varoquaux <gael dot varoquaux at normalesup dot org>
# License: BSD 3 clause

# Standard scientific Python imports
import matplotlib.pyplot as plt

# Import datasets, classifiers and performance metrics
from sklearn import datasets, svm, metrics
from sklearn.model_selection import train_test_split
```

#### Load the Digits dataset

```python
digits = datasets.load_digits()

# Show the first four 8x8 images together with their true labels
_, axes = plt.subplots(nrows=1, ncols=4, figsize=(10, 3))
for ax, image, label in zip(axes, digits.images, digits.target):
    ax.set_axis_off()
    ax.imshow(image, cmap=plt.cm.gray_r, interpolation='nearest')
    ax.set_title('Training: %i' % label)
```

---

### Classification

```python
# flatten the images
n_samples = len(digits.images)
data = digits.images.reshape((n_samples, -1))

# Create a classifier: a support vector classifier
clf = svm.SVC(gamma=0.001)

# Split data into 50% train and 50% test subsets
X_train, X_test, y_train, y_test = train_test_split(
    data, digits.target, test_size=0.5, shuffle=False)

# Learn the digits on the train subset
clf.fit(X_train, y_train)

# Predict the value of the digit on the test subset
predicted = clf.predict(X_test)
```
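
The flattening step and the unshuffled split can be illustrated on a tiny array. The 2x2 "images" below are a hypothetical stand-in for `digits.images`, and the slicing mimics what `train_test_split` does with `test_size=0.5, shuffle=False`.

```python
import numpy as np

# Hypothetical stand-in for digits.images: 6 tiny 2x2 "images".
images = np.arange(24).reshape(6, 2, 2)

# reshape((n_samples, -1)) keeps one row per image and lets NumPy infer
# the number of columns (2 * 2 = 4 here, 8 * 8 = 64 for the real digits).
flat = images.reshape((len(images), -1))

# Without shuffling, the split is a plain cut: the first half is used for
# training and the remainder for testing.
half = len(flat) // 2
train, test = flat[:half], flat[half:]

print(flat.shape)
print(train[0], test[0])
```

Because nothing is shuffled, the result is reproducible, but it also means the test set is simply the later part of the dataset.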

---

### Classification

```python
# Show the first four test images with the labels the classifier predicted
_, axes = plt.subplots(nrows=1, ncols=4, figsize=(10, 3))
for ax, image, prediction in zip(axes, X_test, predicted):
    ax.set_axis_off()
    image = image.reshape(8, 8)  # back from 64 features to an 8x8 image
    ax.imshow(image, cmap=plt.cm.gray_r, interpolation='nearest')
    ax.set_title(f'Prediction: {prediction}')
```

```python
print(f"Classification report for classifier {clf}:\n"
      f"{metrics.classification_report(y_test, predicted)}\n")

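# plot_confusion_matrix was removed in scikit-learn 1.2; from_predictions
# (available since 1.0) builds the same display from the predictions.
disp = metrics.ConfusionMatrixDisplay.from_predictions(y_test, predicted)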
disp.figure_.suptitle("Confusion Matrix")
print(f"Confusion matrix:\n{disp.confusion_matrix}")

plt.show()
```

- [Download Python source code: plot_digits_classification.py](https://scikit-learn.org/stable/_downloads/1a55101a8e49ab5d3213dadb31332045/plot_digits_classification.py)

- [Download Jupyter notebook: plot_digits_classification.ipynb](https://scikit-learn.org/stable/_downloads/eb87d6211b2c0a7c2dc460a9e28b1f6a/plot_digits_classification.ipynb)

---

### nbviewer

https://nbviewer.jupyter.org/

---

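### AI application scenarios in the financial industry

---

### Applications of AI in finance

- Robo-advisors

- Algorithmic trading

- Fraud detection

- Loan and insurance underwriting

- Customer service

- Sentiment and news analysis

- Financial product sales and recommendation

- Anything else you can think of…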

---

### Industry AI practice guidebook walkthrough: finance edition

[Intel China: AI Practice Guidebook for the Financial Services Industry](https://www.intel.cn/content/www/cn/zh/analytics/artificial-intelligence/ai-guidebook-fsi.html)

---

### Predicting financial risk with supply chain networks

[Financial Risk Analysis for SMEs with Graph-based Supply Chain Mining](https://www.ijcai.org/proceedings/2020/0643.pdf)

![](https://pic2.zhimg.com/v2-ed237751e6a1e0c45bcf1cf155b69a60_1440w.jpg?source=172ae18b)

.footnote[
Introduction in Chinese: https://zhuanlan.zhihu.com/p/348060075
]

---

### References

- Zhang, Y. L., Zhou, J., Zheng, W., Feng, J., Li, L., Liu, Z., ... & Zhou, Z. H. (2019). Distributed deep forest and its application to automatic detection of cash-out fraud. ACM Transactions on Intelligent Systems and Technology (TIST), 10(5), 1-19. [download](https://arxiv.org/pdf/1805.04234.pdf)

- Hu, B., Zhang, Z., Zhou, J., Fang, J., Jia, Q., Fang, Y., ... & Qi, Y. (2020, October). Loan Default Analysis with Multiplex Graph Learning. In Proceedings of the 29th ACM International Conference on Information & Knowledge Management (pp. 2525-2532). [download](https://arxiv.org/ftp/arxiv/papers/2104/2104.02479.pdf)

- Agyapong, D. (2020). Analyzing financial risks in small and medium enterprises: evidence from the food processing firms in selected cities in Ghana. International Journal of Entrepreneurial Behavior & Research.

- Feng, Y., Hu, B., Lv, F., Liu, Q., Zhang, Z., & Ou, W. (2020, July). ATBRG: Adaptive Target-Behavior Relational Graph Network for Effective Recommendation. In Proceedings of the 43rd International ACM SIGIR Conference on Research and Development in Information Retrieval (pp. 2231-2240). [download](https://arxiv.org/pdf/2005.12002.pdf)

---

### References

- Zhong, Q., Liu, Y., Ao, X., Hu, B., Feng, J., Tang, J., & He, Q. (2020, April). Financial defaulter detection on online credit payment via multi-view attributed heterogeneous information network. In Proceedings of The Web Conference 2020 (pp. 785-795).

- Li, Z., Zhang, J., Yao, X., & Kou, G. (2021). How to identify early defaults in online lending: A cost-sensitive multi-layer learning framework. Knowledge-Based Systems, 106963.

- Chi, J., Zeng, G., Zhong, Q., Liang, T., Feng, J., Xiang, A., & Tang, J. (2020, November). Learning to Undersampling for Class Imbalanced Credit Risk Forecasting. In 2020 IEEE International Conference on Data Mining (ICDM) (pp. 72-81). IEEE.

- Weng, Y., Chen, L., & Chen, X. (2020). Identifying User Relationship on WeChat Money-Gifting Network. IEEE Transactions on Knowledge and Data Engineering.

---

- Liang, T., Zeng, G., Zhong, Q., Chi, J., Feng, J., Ao, X., & Tang, J. (2021, March). Credit Risk and Limits Forecasting in E-Commerce Consumer Lending Service via Multi-view-aware Mixture-of-experts Nets. In Proceedings of the 14th ACM International Conference on Web Search and Data Mining (pp. 229-237).

- Liu, Y., Ao, X., Zhong, Q., Feng, J., Tang, J., & He, Q. (2020, October). Alike and Unlike: Resolving Class Imbalance Problem in Financial Credit Risk Assessment. In Proceedings of the 29th ACM International Conference on Information & Knowledge Management (pp. 2125-2128).

- Liu, C., Zhong, Q., Ao, X., Sun, L., Lin, W., Feng, J., ... & Tang, J. (2020, August). Fraud Transactions Detection via Behavior Tree with Local Intention Calibration. In Proceedings of the 26th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining (pp. 3035-3043). [download](https://www.researchgate.net/profile/Wangli-Lin-2/publication/343783230_Fraud_Transactions_Detection_via_Behavior_Tree_with_Local_Intention_Calibration/links/5fae2022299bf18c5b707bfb/Fraud-Transactions-Detection-via-Behavior-Tree-with-Local-Intention-Calibration.pdf)

- Hu, B., Zhang, Z., Shi, C., Zhou, J., Li, X., & Qi, Y. (2019, July). Cash-out user detection based on attributed heterogeneous information network with a hierarchical attention mechanism. In Proceedings of the AAAI Conference on Artificial Intelligence (Vol. 33, No. 01, pp. 946-953). [download](https://www.researchgate.net/profile/Binbin-Hu-6/publication/330278847_Cash-out_User_Detection_based_on_Attributed_Heterogeneous_Information_Network_with_a_Hierarchical_Attention_Mechanism/links/5c36d7c492851c22a368d030/Cash-out-User-Detection-based-on-Attributed-Heterogeneous-Information-Network-with-a-Hierarchical-Attention-Mechanism.pdf?_sg%5B0%5D=yr6_JPITuPSsQ8ZaJpzGFW-dq4c0Nf7DzfxWS5vMxPoSYebGnvR1W6M2Xe2Po5spQge5ErZCzfIK0tsKbONLnw.G-fW4Xqz2B0_pfHRiCul0_RLQYs2je5ESTHcmCLjg30e33lzkdr8EZ0g4YbNacBbXSFS5o6YAV8h4NrOYw5DzQ&_sg%5B1%5D=vzecGeThKssIy59g8FUN8Ojej-GO6WrA-ZCUvFaBwkzDoBXMvaBACyhKxDx3e7We-bU1f2i3yQZdYg7TKr8X9momqYIRTiEJdV2YVagqBtQ6.G-fW4Xqz2B0_pfHRiCul0_RLQYs2je5ESTHcmCLjg30e33lzkdr8EZ0g4YbNacBbXSFS5o6YAV8h4NrOYw5DzQ&_iepl=)

- Liu, Z., Wang, D., Yu, Q., Zhang, Z., Shen, Y., Ma, J., ... & Qi, Y. (2019, November). Graph representation learning for merchant incentive optimization in mobile payment marketing. In Proceedings of the 28th ACM International Conference on Information and Knowledge Management (pp. 2577-2584).

---

- Lin, J., Zhang, Z., Zhou, J., Li, X., Fang, J., Fang, Y., ... & Qi, Y. (2018, December). NetDP: An Industrial-Scale Distributed Network Representation Framework for Default Prediction in Ant Credit Pay. In 2018 IEEE International Conference on Big Data (Big Data) (pp. 1960-1965). IEEE.

- Zhang, Z., Chen, C., Zhou, J., & Li, X. (2018, May). An industrial-scale system for heterogeneous information card ranking in alipay. In International Conference on Database Systems for Advanced Applications (pp. 713-724). Springer, Cham.

- Shi, C., Zhang, Z., Luo, P., Yu, P. S., Yue, Y., & Wu, B. (2015, October). Semantic path based personalized recommendation on weighted heterogeneous information networks. In Proceedings of the 24th ACM International on Conference on Information and Knowledge Management (pp. 453-462). cites: 168 times

- He, B., Zhang, Z., Liu, J., Zhuang, F., & Shi, C. (2015, July). Repeat Buyers Prediction after Sales Promotion for Tmall Platform. In Proceedings of the 1st International Workshop on Social Influence Analysis co-located with 24th International Joint Conference on Artificial Intelligence (IJCAI 2015), Buenos Aires, Argentina (Vol. 27).

---

### Workshops and resources

- [FinNLP-2021](https://sites.google.com/nlg.csie.ntu.edu.tw/finnlp2021/home)

- [Deep Learning for Finance: data & papers](https://github.com/sangyx/deep-finance)

- [MATLAB: AI in finance](https://ww2.mathworks.cn/discovery/ai-in-finance.html)

- [Overview of AI technology applications in China's financial industry, 2019](http://qccdata.qichacha.com/ReportData/PDF/7a1add86e60f78f07994e770cc3bc9a8.pdf)

- [Applications of AI in finance](https://developer.aliyun.com/article/177941)

- [Deloitte: The new landscape of financial services](https://www2.deloitte.com/content/dam/Deloitte/cn/Documents/financial-services/deloitte-cn-fs-how-artificial-intelligence-is-transforming-the-financial-ecosystem-zh-181123.pdf)

.footnote[
- [What are the applications of AI in fintech?](https://www.zhihu.com/question/57409852)  
- [Deep Finance: academic papers on fintech](https://www.zhihu.com/column/c_1336229676193095680)  
- https://zhuanlan.zhihu.com/p/133647008
]