Reading User Input¶

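Every program we have seen so far works the same way: you click Run, take your hands off the keyboard, and wait for it to finish and print a result.

Most programs we actually use, however, interact with the user.
Mobile games, for example, respond differently depending on what the player does.
A search engine reads the text we type and then looks up related information across the internet.

Mouse clicks, swipes on a screen, and typed text are all common forms of input.
In the future there may be voice, gesture, or even brain-wave input (Elon Musk's brain-computer interface project aims to use brain activity as input).

In this note we only show the simplest way to read user input, and focus instead on how to process it.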

The `input` function¶

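Don't spend too much time digging into this function; it is rarely used in real-world development.

The input function reads what the user types in the terminal. It is used like this:

user_input = input("Please enter a string: ")

When the program runs, the computer first prints Please enter a string: and then pauses, waiting while the user types in the terminal until the user presses Enter.
Everything the user typed is stored, as a string, in the variable user_input.

We have now saved the user's input in a variable. By processing that variable, the program can interact with the user.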
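
As a minimal sketch of what "processing that variable" can look like, the snippet below wraps input in a small function. The name ask_name and its read parameter are made up for this illustration; read defaults to the built-in input, so the function can also be driven by a stand-in reader when no terminal is available.

```python
def ask_name(read=input):
    """Ask for a name and return a greeting (hypothetical helper)."""
    raw = read("Please enter your name: ")  # always a str, without the trailing newline
    name = raw.strip()                      # drop surrounding whitespace
    if not name:
        return "Hello, stranger!"
    return "Hello, " + name + "!"


# Simulate a user typing "  Ada  " and pressing Enter, without a terminal.
greeting = ask_name(read=lambda prompt: "  Ada  ")
print(greeting)
```

Calling ask_name() with no argument uses the real input function and waits for the keyboard.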

The `open` function¶

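This function is extremely useful, and it is not hard to use!

The open function opens a file, which lets us read its contents.

Note that opening a file with open is not quite the same as double-clicking it on the Windows desktop.
When you double-click a file in Windows, what happens is roughly this (pseudocode):

application = select_open_with(file_name)  # choose a program based on the file type
file = open(file_name)                     # open the file with our open function to get a file object
application.process(file)                  # let the chosen program handle the file

In other words, our open function is only one small step in what Windows does when you double-click a file.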

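Concretely, suppose we want to open a file some_file.txt and read everything in it. Put the file in the same directory as the .py file, run the script from that directory (a relative path is resolved against the current working directory), and run the following code:

file = open('some_file.txt')
lines = file.readlines()

Note: if you call readlines() a second time on the same file object, it returns an empty list, as if the file were empty. The file object keeps track of its current position, and after the first readlines() that position is at the end of the file. To read the contents again, call file.seek(0) to go back to the beginning, or open the file again.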
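
The behavior of repeated readlines() calls can be checked with a short, self-contained sketch. It creates its own throwaway some_file.txt in a temporary directory (an assumption made only so the example runs anywhere) and uses a with block, which closes the file automatically.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as folder:
    # Create a small throwaway file so the example is self-contained.
    path = os.path.join(folder, "some_file.txt")
    with open(path, "w", encoding="utf-8") as f:
        f.write("first line\nsecond line\n")

    # Read it back; the file is closed when the with block ends.
    with open(path, encoding="utf-8") as f:
        first_read = f.readlines()   # reads up to the end of the file
        second_read = f.readlines()  # the position is already at the end
        f.seek(0)                    # move back to the beginning
        third_read = f.readlines()   # same content as the first read

print(first_read)
print(second_read)
print(third_read)
```

Each item returned by readlines() keeps its trailing newline character; line.strip() removes it.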

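String processing¶

Once we have a string, we can start processing it.

Indexing strings¶

First of all, a string behaves a bit like a list (a collection of items wrapped in [ ]).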

In [1]:

my_string = "Hello world!" # remember, we start counting from 0!
print(my_string[1])
print(my_string[3])
print(my_string[5]) # index 5 is a space!
print(my_string[7])

e
l
 
o

In [2]:

print(my_string[1:5]) # substring from index 1 to 5: includes 1 but excludes 5
print(my_string[-2])  # the second-to-last character
print(len(my_string)) # the length of my_string

ello
d
12
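
A few more slicing forms follow the same "include the start, exclude the end" rule. The variable names below are chosen only for this sketch.

```python
my_string = "Hello world!"

head = my_string[:5]             # from the start up to, but not including, index 5
tail = my_string[6:]             # from index 6 to the end
last_six = my_string[-6:]        # the last six characters
every_other = my_string[::2]     # every second character
backwards = my_string[::-1]      # the whole string reversed

print(head)
print(tail)
print(last_six)
print(every_other)
print(backwards)
```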

Splitting and joining strings¶

We can use split to break a piece of text apart:

In [3]:

my_string = "You guys made my day"
broken = my_string.split(" ")
print(broken)

['You', 'guys', 'made', 'my', 'day']

You can choose a different separator (the string is split wherever that separator appears):

In [4]:

my_string = "You-guys-made-my-day"
broken = my_string.split("-")
print(broken)

my_string = "You%!%!guys%!%!made%!%!my%!%!day"
broken = my_string.split("%!%!")
print(broken)

['You', 'guys', 'made', 'my', 'day']
['You', 'guys', 'made', 'my', 'day']

You can also go the other way with join, which uses a string to glue the items of a list together into a single string:

In [5]:

broken = ["You", "guys", "made", "my", "day"]
print(" ".join(broken))
print("----".join(broken))
print("".join(broken))

You guys made my day
You----guys----made----my----day
Youguysmademyday
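
Real input often contains repeated spaces, tabs, or a trailing newline. The sketch below, using a made-up sample line, compares split(" ") with split() called without arguments, and shows the optional maxsplit argument.

```python
line = "  You   guys made\tmy day \n"

by_space = line.split(" ")       # splitting on a single space keeps empty pieces
words = line.split()             # no argument: split on any run of whitespace
first_cut = "a,b,c,d".split(",", 1)  # maxsplit=1: split only at the first comma

print(by_space)
print(words)
print(first_cut)
print(" ".join(words))           # rebuild a clean sentence from the words
```

Splitting with no argument and joining with a single space is a quick way to normalize whitespace.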

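Cleaning strings¶

Often the strings you get are "dirty" and awkward to work with, so you need to clean them up first. For example, in problem 3 of the equation-solver project, suppose you want the user to enter an initial guess. You might write something like this: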

In [6]:

def solve_equation(f):
    user_input = input("Enter your initial guess: ")  # read the user's input
    x0 = float(user_input)  # convert the input string to a float (floating-point number)
    x = x0
    while abs(x - x0) > 1e-6:
        # do some work here (placeholder)
        return None

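The problem is that you have no idea what the user will type. If user_input is "3.14abc", float raises a ValueError and your program crashes. You need to anticipate this and tell the user that the input is invalid. (Imagine filling in a form in some app, accidentally typing letters into the date field, and having the app simply vanish, that is, crash. How frustrating would that be?)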
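
One common way to keep the program alive is to catch the ValueError that float raises. The helper name read_float below is hypothetical, and this is only a sketch of the idea, not the project's required solution.

```python
def read_float(text):
    """Return text converted to a float, or None if it is not a valid number."""
    try:
        return float(text)
    except ValueError:
        return None


for candidate in ["3.14", "3.14abc", ""]:
    value = read_float(candidate)
    if value is None:
        print(repr(candidate), "is not a valid number, please try again")
    else:
        print(repr(candidate), "->", value)
```

In an interactive program, the same check would sit inside a loop that keeps asking until read_float returns a number.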

To validate and clean up this kind of input, we will use regular expressions.

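Regular expressions¶

First, let's look at some examples of what a format (or rule) is:

Mobile phone number (format 1): exactly 11 digits, e.g. 13588888888
Email address (format 2): [letters and digits]@[letters and digits].com, e.g. a3b4@out2look5.com

Formats like these are easy to describe in plain language, and a person can easily tell whether a string follows one. Describing the rule to a program, and having the program check a string against it, takes considerably more effort.

Fortunately, regular expressions make checking a format easy.

Written as regular expressions, the two formats above are: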

^[0-9]{11}$
^[0-9A-Za-z]+@[0-9A-Za-z]+\.com$

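Let's go through the phone number format:

^ matches the start of the string. Without ^, aaa13588888888 would also be accepted.
$ matches the end of the string. Without $, 13588888888aaa would also be accepted.
[0-9] means any digit from 0 to 9; it is shorthand for [0123456789]. If you changed it to [1357], then 13577533571 would match, but any string containing other digits would not.
{11} means exactly 11 repetitions. {11,} means at least 11, and {,11} means at most 11 (written without a space inside the braces).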
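
The effect of each piece of the phone number pattern can be tried out with re.search. The helper accepted is a made-up name for this sketch; it simply reports whether the pattern is found anywhere in the text.

```python
import re


def accepted(pattern, text):
    """True if the pattern is found somewhere in text (hypothetical helper)."""
    return re.search(pattern, text) is not None


full = r"^[0-9]{11}$"

print(accepted(full, "13588888888"))              # exactly 11 digits: accepted
print(accepted(full, "aaa13588888888"))           # rejected because of ^
print(accepted(r"[0-9]{11}$", "aaa13588888888"))  # accepted once ^ is removed
print(accepted(full, "13588888888aaa"))           # rejected because of $
print(accepted(r"^[0-9]{11}", "13588888888aaa"))  # accepted once $ is removed
print(accepted(r"^[1357]{11}$", "13577533571"))   # only the digits 1, 3, 5, 7: accepted
print(accepted(r"^[1357]{11}$", "13588888888"))   # contains 8: rejected
```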

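Now the email address format:

[0-9A-Za-z] matches any digit, uppercase letter, or lowercase letter. Do not shorten it to [0-9A-z]: the range A-z also includes the characters [ \ ] ^ _ ` that sit between Z and a in ASCII.
+ means at least once. It can also be written {1,}.
\. matches a literal dot. An unescaped . would match any character.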
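
The email pattern can be checked the same way. Keep in mind that this is the simplified format from this note (letters and digits, then @, then letters and digits, then .com), not a validator for real-world email addresses; the sample strings are made up.

```python
import re

email_pattern = r"^[0-9A-Za-z]+@[0-9A-Za-z]+\.com$"
same_with_braces = r"^[0-9A-Za-z]{1,}@[0-9A-Za-z]{1,}\.com$"  # + written as {1,}

samples = [
    "a3b4@out2look5.com",   # follows the format
    "@out2look5.com",       # nothing before @, but + needs at least one character
    "a3b4@out2look5xcom",   # no literal dot before com
    "a.b@mail.com",         # a dot is not a letter or digit
]

results = {}
for text in samples:
    results[text] = re.search(email_pattern, text) is not None
    print(text, "->", results[text])
```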

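Regular expressions have a very large syntax, but mastering a handful of key operations is enough for most needs.

In Python, regular expressions can be used for substitution, searching, and more:

Substitution with regular expressions¶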

In [7]:

import re

sentence = "This is the 123th of all 456 sentences written in 2021."

result1 = re.sub('[0-9]{3} ', "1000 ", sentence)
# Replace every three-digit number that is followed by a space with 1000
# (so 456 is replaced, while 123th is left alone).
# Note the space after [0-9]{3}! Without it, 2021 would become 10001.
# Note the space after 1000 as well; otherwise the following space would be lost.

print("After replacing three-digit numbers with 1000: " + result1)

print("Two failed attempts:")
print(re.sub('[0-9]{3}', "1000", sentence))
print(re.sub('[0-9]{3} ', "1000", sentence))

After replacing three-digit numbers with 1000: This is the 123th of all 1000 sentences written in 2021.
Two failed attempts:
This is the 1000th of all 1000 sentences written in 10001.
This is the 123th of all 1000sentences written in 2021.
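
Relying on a trailing space is fragile. As an alternative sketch (not the approach used in the cell above), a word boundary \b or a pair of lookarounds can state "exactly three digits" directly in the pattern.

```python
import re

sentence = "This is the 123th of all 456 sentences written in 2021."

# \b is a word boundary. "456" has one on both sides, but "123th" and "2021"
# do not, because letters and digits both count as word characters.
with_boundary = re.sub(r"\b[0-9]{3}\b", "1000", sentence)

# Lookarounds: three digits with no digit directly before or after them.
# This also catches the 123 in "123th" but still leaves 2021 alone.
with_lookaround = re.sub(r"(?<![0-9])[0-9]{3}(?![0-9])", "1000", sentence)

print(with_boundary)
print(with_lookaround)
```

Which variant is right depends on whether 123th should count as a three-digit number.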

Searching with regular expressions¶

In [8]:

result = re.search(r"[0-9]+[a-z]+", sentence) # search for digits followed by letters (this matches the string 123th)
if result: 
    print("Yes!")
else:
    print("No...")    

print(result) # on success, result is a Match object, which counts as True in a condition

Yes!
<_sre.SRE_Match object; span=(12, 17), match='123th'>

In [9]:

result = re.search(r"[0-9]+[a-z]{3,}", sentence)
if result:
    print("Yes!")
else:
    print("No...")    
    
print(result) # the search failed, so result is None, which counts as False in a condition

No...
None
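
A Match object is more than a yes/no answer. The sketch below pulls the matched text and its position out of it, and uses re.findall to collect every match instead of only the first one.

```python
import re

sentence = "This is the 123th of all 456 sentences written in 2021."

result = re.search(r"[0-9]+[a-z]+", sentence)
if result is not None:
    matched_text = result.group()  # the text that matched
    position = result.span()       # (start, end) indexes within sentence
    print(matched_text, position)

all_numbers = re.findall(r"[0-9]+", sentence)  # every run of digits, as strings
print(all_numbers)
```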

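Using a `dictionary`¶

In Python, a dictionary works just like a real-life dictionary: it consists of keys and values.

For example, to find out what the word 'serendipity' means, you ask a dictionary: "Please tell me the meaning of 'serendipity'."

Here, 'serendipity' is the key.

The dictionary then hands back the corresponding value, that is, the definition, example sentences, and other information about 'serendipity'.

In code, this looks like: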

In [10]:

my_dictionary = dict() # create a new, empty dictionary
my_dictionary["ice"] = "冰。The form water takes when its temperature is below 0 at STP"
my_dictionary["would"] = "愿意。Used in hypothetical situations or when describing the future from the point of view of the past"
my_dictionary["serendipity"] = "美丽的巧合。Something that is quite unexpected, yet desirable."
# a dictionary may contain many, many more words...
# ...

print(my_dictionary["serendipity"])

美丽的巧合。Something that is quite unexpected, yet desirable.

We can look at everything in the dictionary:

In [11]:

print("\nAll keys in the dictionary:")
print(my_dictionary.keys())
print("\nAll values in the dictionary:")
print(my_dictionary.values())
print("\nAll keys and values in the dictionary:")
print(my_dictionary.items())

All keys in the dictionary:
dict_keys(['ice', 'would', 'serendipity'])

All values in the dictionary:
dict_values(['冰。The form water takes when its temperature is below 0 at STP', '愿意。Used in hypothetical situations or when describing the future from the point of view of the past', '美丽的巧合。Something that is quite unexpected, yet desirable.'])

All keys and values in the dictionary:
dict_items([('ice', '冰。The form water takes when its temperature is below 0 at STP'), ('would', '愿意。Used in hypothetical situations or when describing the future from the point of view of the past'), ('serendipity', '美丽的巧合。Something that is quite unexpected, yet desirable.')])
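
Looking up a key that is not in the dictionary raises a KeyError. The sketch below, which uses a small made-up dictionary, shows a few everyday operations: testing for a key with in, looking up with a fallback via get, looping over items, and updating or deleting entries.

```python
small_dictionary = {
    "ice": "frozen water",
    "would": "used in hypothetical situations",
}

print("ice" in small_dictionary)          # is this key present?
print("serendipity" in small_dictionary)

# small_dictionary["serendipity"] would raise KeyError; get returns a fallback instead.
fallback = small_dictionary.get("serendipity", "(no entry)")
print(fallback)

for key, value in small_dictionary.items():
    print(key, "->", value)

small_dictionary["ice"] = "water in solid form"  # assigning to an existing key overwrites it
del small_dictionary["would"]                    # remove an entry
print(len(small_dictionary))
```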

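Of course, when programming we need to think of a dictionary more abstractly. Keys do not have to be words: any hashable value, such as a number, a string, or a tuple, can be a key. Values can be anything at all, so a dictionary can hold any kind of data, even another dictionary.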

In [12]:

another_dict = dict()
another_dict[4] = "sdgsg"
another_dict[(29, 121)] = "Ningbo"
another_dict["浙江"] = {
    1: "杭州",
    3: "宁波",
    5: "温州",
    9: "台州"
}

print(another_dict["浙江"][5])
print(another_dict[(29, 121)])

温州
Ningbo

Case study¶

In [13]:

def isValid(user_input):
    # True only if the string is non-empty and every character is a decimal digit
    return user_input.isdecimal()

print(isValid("3.14abc"))

False

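The isdecimal method tells you whether a string is non-empty and consists only of decimal digits. It does not test whether the string can be converted to a float, so it is of limited use here: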

In [14]:

my_string = " -3.14   "
print(isValid(my_string)) # isdecimal says it is not valid
print(float(my_string))   # but it can in fact be converted to a float

False
-3.14

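float does some cleaning of the string on its own, but only to a limited extent: it handles things like a leading plus or minus sign and surrounding whitespace.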
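
Putting the pieces together, one possible validator strips the whitespace and then checks the result against a regular expression before converting. The pattern and the names FLOAT_PATTERN and parse_number are assumptions made for this sketch. The accepted format (optional sign, digits, optional decimal part) is deliberately stricter than float, which also accepts strings such as "1e-6" or "3.".

```python
import re

# Assumed format: optional sign, one or more digits, optional ".digits" part.
FLOAT_PATTERN = re.compile(r"^[+-]?[0-9]+(\.[0-9]+)?$")


def parse_number(user_input):
    """Return the input as a float, or None if it does not follow the format."""
    cleaned = user_input.strip()  # remove surrounding whitespace first
    if FLOAT_PATTERN.search(cleaned) is None:
        return None
    return float(cleaned)


for text in [" -3.14   ", "3.14abc", "42", "", "1e-6"]:
    print(repr(text), "->", parse_number(text))
```

A caller can then keep asking the user for input until parse_number returns something other than None.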
