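class: center, middle, inverse, title-slide

# Artificial Intelligence and Finance

### 吴燕丰

### School of Finance, Jiangxi University of Finance and Economics

### 2021/04/13

---

### Artificial intelligence

Traditional software vs. artificial intelligence

<img src='https://easy-ai.oss-cn-shanghai.aliyuncs.com/2020-01-02-chauntong.png'>

Traditional software is "rule-based": a person has to define the conditions and tell the computer what to do once those conditions are met.

.footnote[
Note (this slide and the following ones are quoted from): [人工智能 – Artificial intelligence | AI](https://easyai.tech/ai-definition/ai/)
]

---

### Artificial intelligence

Traditional software vs. artificial intelligence

<img src='https://easy-ai.oss-cn-shanghai.aliyuncs.com/2020-01-02-ailuoji.png'>

The machine summarizes patterns from a large amount of "specific" data, induces "specific knowledge" from them, and then applies that "knowledge" to real-world scenarios to solve practical problems.

---

### AI is a tool

---

### AI only solves specific problems

---

### It knows what, but not why

It does not care about the why.

---

### Still not that intelligent

---

### The history of AI

---

### First wave (non-intelligent chatbots)

1950s to 1960s

In October 1950, Turing published "Computing Machinery and Intelligence", which asked whether machines can think and proposed the Turing test as a way to test AI.

Not many years after the Turing test was proposed, people saw a first "glimmer of hope" that a computer might pass it.

In 1966 the psychotherapy chatbot ELIZA was born.

People at the time rated it highly, and some patients even enjoyed chatting with it. Its implementation, however, was very simple: a finite library of dialogue, from which the bot returned a specific reply whenever the patient said a certain keyword.

**The first wave did not use any fundamentally new technology; it used tricks to make the computer look like a real person, while the computer itself had no intelligence.**

---

### Second wave (speech recognition)

1980s to 1990s

In the second wave, speech recognition was one of the most representative breakthroughs. The key to the breakthrough was abandoning the symbolic school's approach in favor of a statistical approach to practical problems.

In his book 《人工智能》 (Artificial Intelligence), Kai-Fu Lee (李开复) describes this process in detail; he was one of the key people involved.

**The biggest breakthrough of the second wave was a change of approach: dropping the symbolic school's methods and solving problems with statistics instead.**

---

### Third wave (deep learning + big data)

Early 21st century

2006 was a watershed in the history of deep learning. In that year Geoffrey Hinton published **"A Fast Learning Algorithm for Deep Belief Nets"**, and other important deep learning papers appeared in the same year, bringing several major breakthroughs at the level of basic theory.

--

The third wave arrived mainly because two conditions had matured:

- After 2000 the rapid growth of the internet industry produced massive amounts of data, while the cost of data storage fell quickly, which made storing and analyzing massive data possible.
- The steady maturing of GPUs provided the necessary computing power, making the algorithms more practical and lowering the cost of compute.

--

Once these conditions were in place, deep learning showed its strength, repeatedly setting new records in speech recognition, image recognition, NLP and other fields. AI products finally reached a usable stage (for example, speech recognition error rates of only 6%, face recognition accuracy exceeding that of humans, BERT setting new state-of-the-art results on 11 NLP tasks…).

???

BERT: Bidirectional Encoder Representations from Transformers.

---

### Third wave (deep learning + big data)

**The third wave arrived mainly because big data and computing power were in place, so deep learning could show its full strength; on specific tasks AI performance had surpassed humans and reached a "usable" stage, rather than remaining purely scientific research.**

.footnote[
Note: all of the AI content above is quoted from [人工智能 – Artificial intelligence | AI](https://easyai.tech/ai-definition/ai/)
]

---

class: middle, center

### End of the story!

--

### On to the tools!

---

### Python and machine learning

[scikit-learn: Machine Learning in Python](https://scikit-learn.org/stable/index.html)

1. [Install Anaconda](https://www.anaconda.com/)
2. [Install scikit-learn](https://scikit-learn.org/stable/install.html#installation-instructions)

```bash
pip install -U scikit-learn
```

Getting Started: https://scikit-learn.org/stable/getting_started.html

---

### Classification and regression

<img src='classification_regression.png'>

---

### Clustering and dimensionality reduction

.pull-left[
<img src='clustering.png'>
]

.pull-right[
<img src='dimensionality_reduction.png'>
]

---

### Model selection and preprocessing

<img src='model_selection_preprocessing.png'>

---

### Exercise: Recognizing hand-written digits

<img src='https://scikit-learn.org/stable/_images/sphx_glr_plot_digits_classification_001.png'>

Link: [Recognizing hand-written digits](https://scikit-learn.org/stable/auto_examples/classification/plot_digits_classification.html#sphx-glr-auto-examples-classification-plot-digits-classification-py)

---

### Code

```python
# Author: Gael Varoquaux <gael dot varoquaux at normalesup dot org>
# License: BSD 3 clause

# Standard scientific Python imports
import matplotlib.pyplot as plt

# Import datasets, classifiers and performance metrics
from sklearn import datasets, svm, metrics
from sklearn.model_selection import train_test_split
```

#### Load the Digits dataset

```python
digits = datasets.load_digits()

# Show the first four images with their labels
_, axes = plt.subplots(nrows=1, ncols=4, figsize=(10, 3))
for ax, image, label in zip(axes, digits.images, digits.target):
    ax.set_axis_off()
    ax.imshow(image, cmap=plt.cm.gray_r, interpolation='nearest')
    ax.set_title('Training: %i' % label)
```

---

### Classification

```python
# flatten the images
n_samples = len(digits.images)
data = digits.images.reshape((n_samples, -1))

# Create a classifier: a support vector classifier
clf = svm.SVC(gamma=0.001)

# Split data into 50% train and 50% test subsets
X_train, X_test, y_train, y_test = train_test_split(
    data, digits.target, test_size=0.5, shuffle=False)

# Learn the digits on the train subset
clf.fit(X_train, y_train)

# Predict the value of the digit on the test subset
predicted = clf.predict(X_test)
```

---

### Classification

```python
# Show the first four test images with their predicted labels
_, axes = plt.subplots(nrows=1, ncols=4, figsize=(10, 3))
for ax, image, prediction in zip(axes, X_test, predicted):
    ax.set_axis_off()
    image = image.reshape(8, 8)
    ax.imshow(image, cmap=plt.cm.gray_r, interpolation='nearest')
    ax.set_title(f'Prediction: {prediction}')
```

```python
print(f"Classification report for classifier {clf}:\n"
      f"{metrics.classification_report(y_test, predicted)}\n")

# Build the confusion-matrix display from the true and predicted labels
disp = metrics.ConfusionMatrixDisplay.from_predictions(y_test, predicted)
disp.figure_.suptitle("Confusion Matrix")
print(f"Confusion matrix:\n{disp.confusion_matrix}")

plt.show()
```

- [Download Python source code: plot_digits_classification.py](https://scikit-learn.org/stable/_downloads/1a55101a8e49ab5d3213dadb31332045/plot_digits_classification.py)
- [Download plot_digits_classification.ipynb](https://scikit-learn.org/stable/_downloads/eb87d6211b2c0a7c2dc460a9e28b1f6a/plot_digits_classification.ipynb)

---

### nbviewer

https://nbviewer.jupyter.org/

<iframe width=800 height=400 src='https://nbviewer.jupyter.org/'> </iframe>

---

### AI application scenarios in the financial industry

<img width='100%' src='金融行业AI应用场景.png'>

---

### Applications of AI in finance

- Robo-advisors
- Algorithmic trading
- Fraud detection
- Loan / insurance underwriting
- Customer service
- Sentiment / news analysis
- Sales and recommendation of financial products
- Anything else you can come up with...

---

### Industry AI Practice Handbook walkthrough: finance edition

[Intel China: AI Practice Handbook for the Financial Industry](https://www.intel.cn/content/www/cn/zh/analytics/artificial-intelligence/ai-guidebook-fsi.html)

<iframe src='//players.brightcove.net/2379864814001/default_default/index.html?videoId=6161538953001' allowfullscreen frameborder=0></iframe>

---

### Predicting financial risk with supply chain networks

[Financial Risk Analysis for SMEs with Graph-based Supply Chain Mining](https://www.ijcai.org/proceedings/2020/0643.pdf)

.footnote[
Introduction in Chinese: https://zhuanlan.zhihu.com/p/348060075
]

---

### Literature

- Zhang, Y. L., Zhou, J., Zheng, W., Feng, J., Li, L., Liu, Z., ... & Zhou, Z. H. (2019). Distributed deep forest and its application to automatic detection of cash-out fraud. ACM Transactions on Intelligent Systems and Technology (TIST), 10(5), 1-19. [download](https://arxiv.org/pdf/1805.04234.pdf)
- Hu, B., Zhang, Z., Zhou, J., Fang, J., Jia, Q., Fang, Y., ... & Qi, Y.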
(2020, October). Loan Default Analysis with Multiplex Graph Learning. In Proceedings of the 29th ACM International Conference on Information & Knowledge Management (pp. 2525-2532). [download](https://arxiv.org/ftp/arxiv/papers/2104/2104.02479.pdf)
- Agyapong, D. (2020). Analyzing financial risks in small and medium enterprises: evidence from the food processing firms in selected cities in Ghana. International Journal of Entrepreneurial Behavior & Research.
- Feng, Y., Hu, B., Lv, F., Liu, Q., Zhang, Z., & Ou, W. (2020, July). ATBRG: Adaptive Target-Behavior Relational Graph Network for Effective Recommendation. In Proceedings of the 43rd International ACM SIGIR Conference on Research and Development in Information Retrieval (pp. 2231-2240). [download](https://arxiv.org/pdf/2005.12002.pdf)

---

### Literature

- Zhong, Q., Liu, Y., Ao, X., Hu, B., Feng, J., Tang, J., & He, Q. (2020, April). Financial defaulter detection on online credit payment via multi-view attributed heterogeneous information network. In Proceedings of The Web Conference 2020 (pp. 785-795).
- Li, Z., Zhang, J., Yao, X., & Kou, G. (2021). How to identify early defaults in online lending: A cost-sensitive multi-layer learning framework. Knowledge-Based Systems, 106963.
- Chi, J., Zeng, G., Zhong, Q., Liang, T., Feng, J., Xiang, A., & Tang, J. (2020, November). Learning to Undersampling for Class Imbalanced Credit Risk Forecasting. In 2020 IEEE International Conference on Data Mining (ICDM) (pp. 72-81). IEEE.
- Weng, Y., Chen, L., & Chen, X. (2020). Identifying User Relationship on WeChat Money-Gifting Network. IEEE Transactions on Knowledge and Data Engineering.

---

- Liang, T., Zeng, G., Zhong, Q., Chi, J., Feng, J., Ao, X., & Tang, J. (2021, March). Credit Risk and Limits Forecasting in E-Commerce Consumer Lending Service via Multi-view-aware Mixture-of-experts Nets. In Proceedings of the 14th ACM International Conference on Web Search and Data Mining (pp. 229-237).
- Liu, Y., Ao, X., Zhong, Q., Feng, J., Tang, J., & He, Q. (2020, October). Alike and Unlike: Resolving Class Imbalance Problem in Financial Credit Risk Assessment. In Proceedings of the 29th ACM International Conference on Information & Knowledge Management (pp. 2125-2128).
- Liu, C., Zhong, Q., Ao, X., Sun, L., Lin, W., Feng, J., ... & Tang, J. (2020, August). Fraud Transactions Detection via Behavior Tree with Local Intention Calibration. In Proceedings of the 26th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining (pp. 3035-3043). [download](https://www.researchgate.net/profile/Wangli-Lin-2/publication/343783230_Fraud_Transactions_Detection_via_Behavior_Tree_with_Local_Intention_Calibration/links/5fae2022299bf18c5b707bfb/Fraud-Transactions-Detection-via-Behavior-Tree-with-Local-Intention-Calibration.pdf)
- Hu, B., Zhang, Z., Shi, C., Zhou, J., Li, X., & Qi, Y. (2019, July). Cash-out user detection based on attributed heterogeneous information network with a hierarchical attention mechanism. In Proceedings of the AAAI Conference on Artificial Intelligence (Vol. 33, No. 01, pp. 946-953). [download](https://www.researchgate.net/profile/Binbin-Hu-6/publication/330278847_Cash-out_User_Detection_based_on_Attributed_Heterogeneous_Information_Network_with_a_Hierarchical_Attention_Mechanism/links/5c36d7c492851c22a368d030/Cash-out-User-Detection-based-on-Attributed-Heterogeneous-Information-Network-with-a-Hierarchical-Attention-Mechanism.pdf?_sg%5B0%5D=yr6_JPITuPSsQ8ZaJpzGFW-dq4c0Nf7DzfxWS5vMxPoSYebGnvR1W6M2Xe2Po5spQge5ErZCzfIK0tsKbONLnw.G-fW4Xqz2B0_pfHRiCul0_RLQYs2je5ESTHcmCLjg30e33lzkdr8EZ0g4YbNacBbXSFS5o6YAV8h4NrOYw5DzQ&_sg%5B1%5D=vzecGeThKssIy59g8FUN8Ojej-GO6WrA-ZCUvFaBwkzDoBXMvaBACyhKxDx3e7We-bU1f2i3yQZdYg7TKr8X9momqYIRTiEJdV2YVagqBtQ6.G-fW4Xqz2B0_pfHRiCul0_RLQYs2je5ESTHcmCLjg30e33lzkdr8EZ0g4YbNacBbXSFS5o6YAV8h4NrOYw5DzQ&_iepl=)
- Liu, Z., Wang, D., Yu, Q., Zhang, Z., Shen, Y., Ma, J., ... & Qi, Y. (2019, November).
Graph representation learning for merchant incentive optimization in mobile payment marketing. In Proceedings of the 28th ACM International Conference on Information and Knowledge Management (pp. 2577-2584).

---

- Lin, J., Zhang, Z., Zhou, J., Li, X., Fang, J., Fang, Y., ... & Qi, Y. (2018, December). NetDP: An Industrial-Scale Distributed Network Representation Framework for Default Prediction in Ant Credit Pay. In 2018 IEEE International Conference on Big Data (Big Data) (pp. 1960-1965). IEEE.
- Zhang, Z., Chen, C., Zhou, J., & Li, X. (2018, May). An industrial-scale system for heterogeneous information card ranking in alipay. In International Conference on Database Systems for Advanced Applications (pp. 713-724). Springer, Cham.
- Shi, C., Zhang, Z., Luo, P., Yu, P. S., Yue, Y., & Wu, B. (2015, October). Semantic path based personalized recommendation on weighted heterogeneous information networks. In Proceedings of the 24th ACM International on Conference on Information and Knowledge Management (pp. 453-462). cites: 168 times
- He, B., Zhang, Z., Liu, J., Zhuang, F., & Shi, C. (2015, July). Repeat Buyers Prediction after Sales Promotion for Tmall Platform. In Proceedings of the 1st International Workshop on Social Influence Analysis co-located with 24th International Joint Conference on Artificial Intelligence (IJCAI 2015), Buenos Aires, Argentina (Vol. 27).

---

### Workshops and resources

- [FinNLP-2021](https://sites.google.com/nlg.csie.ntu.edu.tw/finnlp2021/home)
- [Deep Learning for Finance: data & papers](https://github.com/sangyx/deep-finance)
- [MATLAB: Applications of AI in finance](https://ww2.mathworks.cn/discovery/ai-in-finance.html)
- [Overview of AI applications in China's financial industry, 2019](http://qccdata.qichacha.com/ReportData/PDF/7a1add86e60f78f07994e770cc3bc9a8.pdf)
- [Applications of AI in finance](https://developer.aliyun.com/article/177941)
- [Deloitte: The new landscape of financial services](https://www2.deloitte.com/content/dam/Deloitte/cn/Documents/financial-services/deloitte-cn-fs-how-artificial-intelligence-is-transforming-the-financial-ecosystem-zh-181123.pdf)

.footnote[
- [What are the applications of AI in fintech?](https://www.zhihu.com/question/57409852)
- [Deep Finance: academic papers related to fintech](https://www.zhihu.com/column/c_1336229676193095680)
- https://zhuanlan.zhihu.com/p/133647008
]
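
---

### Appendix: a keyword-reply bot sketch

The first-wave slide describes ELIZA as a finite dialogue library in which a keyword triggers a fixed reply. The sketch below only illustrates that idea; it is not ELIZA's actual script, and the keywords, the replies and the names `RULES`, `DEFAULT_REPLY` and `reply` are all made up for illustration.

```python
# Hypothetical keyword -> reply table; NOT taken from the real ELIZA script.
RULES = {
    "mother": "Tell me more about your family.",
    "sad": "Why do you feel sad?",
    "always": "Can you think of a specific example?",
}

# Generic prompt used when no keyword is recognised (also an assumption).
DEFAULT_REPLY = "Please go on."


def reply(utterance: str) -> str:
    """Return the canned reply for the first known keyword in the utterance."""
    for word in utterance.lower().split():
        # Strip surrounding punctuation so that "sad." still matches "sad".
        keyword = word.strip(".,!?;:\"'")
        if keyword in RULES:
            return RULES[keyword]
    # No keyword matched: fall back to the generic prompt.
    return DEFAULT_REPLY


if __name__ == "__main__":
    for line in ["I feel sad today.", "My mother calls me.", "Nice weather."]:
        print(f"> {line}")
        print(reply(line))
```

Every answer here is a table lookup rather than the result of any understanding, which is the slide's point that the computer itself had no intelligence.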