Hi and welcome! We at 365 Data Science specialize in data science trainings. We post videos weekly, so you can master indispensable skills for free! All right, let’s get started! What do you need to know about programming if you are just getting started? We all have to deal with certain tasks in our daily lives. Many we can solve on our own, while others, especially the more complicated ones, can be solved with the help of a computer. Assume you have defined a problem that must be solved and you know the steps that must be taken to solve it. Even if you could structure your logic perfectly and type a brilliant solution in English, the computer would not understand it, as it understands 1s and 0s only. No other symbols. It is similar to a light switch, which recognizes two states: on and off. To communicate a real-life problem to the computer, you need to create a specific type of text, called source code or human-readable code, that software can read and then translate into 0s and 1s for the computer. A program is a sequence of instructions that specifies how to execute a computation. Therefore, the formal definition of programming is the following: taking a task and writing it down in a programming language that the computer can understand and execute. You need not be a geek or a computer scientist to program. Actually, computer science is not the study of programming; these are different things, and this can confuse beginners. Computer science is about understanding what computers can do. Programming, instead, is the activity of telling computers to do something for us. Think about the world we live in today. There are more than a thousand programming languages out there, and each language is designed for carrying out specific tasks. So, depending on the sphere to which your problem applies, only some languages can be of good use. For instance, PHP is good for web programming but is not suitable for programming devices.
C++ can definitely help you with the latter, while Python and R are some of the favorite tools of data scientists and people from the finance industry. When you meet an experienced programmer, don’t assume they can program in every language out there. Instead, it is likely they work with one or maybe a few languages, but have mastered them well. Right. But how does somebody become good at programming? First, programming requires problem-solving skills and involves abstract thinking. You are supposed to understand your task perfectly, and then break it down into a sequence of instructions (or smaller computational steps) that the machine can execute. For example, John is asked by his boss to do the following: create a program that adds 10 to any number his boss inputs with the keyboard. The correct reasoning would be: if x is the unknown provided, we need an output of x + 10. After you have worked out these steps, you will type them in, with the help of a programming language, as beautifully organized lines of code. So, the second crucial thing to develop is mechanistic thinking. Unfortunately, computers can only execute what you ask them to do, and they won’t understand what you imply by the instructions you have provided. They will simply run the code, without interpreting its output. Fortunately, we can do that. Humans can understand and interpret code instructions and adjust them whenever necessary. And this is why solid knowledge of the syntax of a programming language and the ability to understand computer code are of paramount importance: they will positively affect your thinking process, allowing you to break down your problem into parts the computer can execute. In the example we provided above, John must think of the following subtasks: first, he must define a function that takes x as an argument, and then have it return as output a new variable equal to x + 10. This is how this problem can be solved.
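John’s task can be sketched in Python as follows. This is a minimal illustration rather than code from the video: the function name add_ten is a hypothetical choice, and the keyboard-input step is left out so the sketch runs anywhere.

```python
def add_ten(x):
    """Return the input number increased by 10."""
    result = x + 10  # the new variable equal to x + 10
    return result


# Example calls with values the boss might type in.
print(add_ten(5))     # prints 15
print(add_ten(-3.5))  # prints 6.5
```

Keeping the calculation inside a named function means the same steps can be reused for any number, which is the "sequence of instructions" idea described above.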
Regardless of the problem you are facing or the programming language you are using, your coding style is crucial! Remember that. Three lines of code are straightforward to understand. However, in practice, you will likely work with hundreds of lines of code that must be shared with other people. If your work is difficult to read, unnecessarily complicated, or full of variable names that convey no meaning, it will be poorly received by other programmers. Therefore, throughout this course, we will pay attention to the best practices that will help you organize your code! Programming challenges are great as they develop your mechanistic thinking and problem-solving abilities. This involves formulating problems, breaking them down into meaningful steps, and communicating these steps to the computer in an organized way. Although it may be new to some of you, Python has been on the programming stage for over two decades. There are two main reasons you should learn Python. First, it has several technical advantages compared to other programming languages. And second, its practical application covers several industries. It is a powerful computational tool when we have to solve complicated tasks in the fields of finance, econometrics, economics, data science, and machine learning. Therefore, it is a perfect stepping stone for somebody who is learning how to code and is determined to pursue a career as a data scientist. Here’s a slightly more technical description of Python. It is an open-source, general-purpose, high-level programming language. Let’s break this definition into several pieces and try to understand each of these attributes. Open-source software (OSS): Open-source means it is free to use and its source code is publicly available. Python has a large and active scientific community that has access to the software’s source code and contributes to its continuous development and upgrading, depending on users’ needs.
This is a major reason Python is cross-platform: it is available for all major operating systems, namely Windows, Mac, and Linux. The benefit is that Python can be quickly applied anywhere. Domain-specific languages, like MATLAB and SAS, which are also used for solving financial and econometric tasks, are paid products. This plays a role in a language’s popularity. General-purpose: Yes, we will dig deeper into one of Python’s specific applications, the analysis of financial data. However, you should know there is a broad set of fields where it could be applied. For instance, Python can be used for web programming through the Django framework. Although this is beyond the scope of this course, you should be aware that the wide scope of application and the interoperability with other programming languages could explain why some large organizations have chosen Python as their main programming language. High-level: This is slightly more technical. Broadly speaking, computers can run programs written in low-level languages only, also called machine languages. So, a program written in a high-level language must first be translated into a low-level language before it can be executed. This process takes time. There are specialized software applications that will do this translation for you. Nevertheless, the advantages of using a high-level language are huge! Low-level programming languages are difficult to code in and to understand. They are too technical. High-level languages employ syntax a lot closer to human logic, which makes the language easier to learn and implement. It allows the programmer to focus on the task at hand, instead of trying to figure out unreadable lines of code.
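To see a translation step in action, the sketch below uses the standard-library dis module to list the lower-level instructions that CPython generates for a one-line function. Two caveats: these instructions are CPython bytecode, an intermediate form run by the Python interpreter rather than true machine language, and the exact instruction names vary between Python versions. The function add_ten is a hypothetical example, not code from the video.

```python
import dis


def add_ten(x):
    return x + 10


# Collect the bytecode instructions CPython compiled the function into.
instructions = list(dis.get_instructions(add_ten))

for instruction in instructions:
    # opname is the instruction's name; argrepr describes its argument, if any.
    print(instruction.opname, instruction.argrepr)
```

Comparing the single readable line `return x + 10` with the printed list gives a feel for why high-level syntax is easier to work with.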
To summarize the technical advantages that make Python a powerful programming language, often preferred over other programming languages, we can say the following: - It is free and constantly updated; - It can be used in multiple domains; - It does not require too much time to process calculations and has an intuitive syntax that allows for complex quantitative computations. What we’ve said so far demonstrates Python’s enormous practical applicability. It is one of the most popular programming languages in several fields. One of them is the world of finance. Just consider, today, banks and financial institutions spend more on technology than any other industry! Thousands of developers work in financial institutions to maintain existing software and build new programs. There is a growing demand for people who have solid knowledge about the world of finance and Python programming. It is clear we are living in the era of Big Data. People in different disciplines - economics, finance, computer science, marketing, and many more can retrieve huge amounts of data. We can talk about Big Data when we have millions of observations. In such situations, the computational capabilities of traditional data processing applications, like Microsoft Excel, become insufficient. We need a more powerful tool to tackle Big Data in more or less the same way, regardless of the field of application. Python is perfect for these situations, as it gives us flexibility. To conclude, Python’s popularity lies on two main pillars. One is that it is an easy-to-learn programming language designed to be highly readable, with a syntax quite clear and intuitive. And the second reason is its user-friendliness does not take away from its strength. Python can execute a variety of complex computations and is one of the most powerful programming languages preferred by specialists. For this course, you must install both Python and Jupyter on your computer. 
If you already have them, you can still follow this lecture, because we will say a few interesting things about Jupyter. So, why isn’t there just one software application, called “Python”, that you can install on your computer, that updates automatically, and that runs everything smoothly? I am sorry to tell you, but that’s not the case. We have to deal with reality. First, Python is a programming language. It allows you to communicate with the computer. To do that, you’ll need the help of a specific software application. Namely, the Jupyter Notebook App, more often called Jupyter, can help us do that. It is a server-client application that allows you to edit your code through a web browser. Consider the following graph. All units represent different software. On one side, you have several language kernels. These are programs designed to read and execute code in a specific programming language, like Python, R, or Julia. The Jupyter installation always comes with an installed Python kernel, and the other kernels can be installed additionally. On the other side, you have various types of interfaces, where you can write code. They represent the clients. An example of such a client is the web browser. The Jupyter server provides the environment where a client is matched with a corresponding language kernel. In our case, we will focus on Python, and a web browser as a client, or as an interactive shell. Your work will be stored in a notebook document, and since we will be strictly using the Python language, it will be called an “IPython Notebook” file, with the file extension “dot ipynb”. Having said all that, we can explain why Jupyter is used in so many large corporations, like Google, Microsoft, and IBM. By design, it is well-suited to demonstrations of programming concepts and training. First, in large corporations, solving a particular task could require coding in a few languages, say Python, R, Julia, or PHP.
Instead of installing different interfaces for each language kernel you need, Jupyter allows you to use the same notebook file structure. Simply put, each notebook you create will connect to the language kernel you request. Consider also that this file can be easily stored locally or on a remote server. Therefore, Jupyter facilitates communication between teams in a corporation tremendously. Second, Jupyter is not a text editor that opens a new window every time you execute a different part of your code, as is the case with some other software applications. In the same file, you can have pure text that communicates a message to the reader, computer code like Python, and output containing rich text, like equations, figures, graphs, pictures, and others. This simplifies the workflow immensely, and Jupyter Notebook is increasingly preferred over other software packages. That’s why we’ll use it too. The next step is to install Anaconda, a software package that contains both the Python programming language and the Jupyter Notebook App. There are various ways to install Python on your computer. But especially for new users, it is highly recommended to opt for Anaconda. It will install not only Python, but also the Jupyter Notebook App and many scientific computing and data science packages. Let’s open www.continuum.io and click on the “Download Anaconda” button on the home page. You have to pick one of the three operating systems – Windows, Mac, or Linux. I will show you how to install Anaconda on Windows, but the procedure is identical if you are going to use the Mac or Linux version. Now, you must choose the best among the four provided options. Do you need the 2.7 or the 3.6 version? And then there is a 32-bit or a 64-bit version, depending on the Windows you have installed. I know it sounds strange to maintain not one but two versions of a single program.
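As a rough illustration of what having two versions means in practice, the sketch below runs on Python 3 and notes in comments how Python 2 behaves in two well-known places: printing and integer division. It is not taken from the course notebooks, and the variable names are hypothetical.

```python
import sys

# Report which major and minor version is running this code.
print("Running Python", sys.version_info.major, sys.version_info.minor)

# Python 3: print is a function and needs parentheses.
# Python 2 also accepts the statement form: print "hello"
print("hello")

# Python 3: / always performs true division.
# Python 2: / between two integers discards the remainder, so 7 / 2 gives 3.
true_division = 7 / 2    # 3.5 on Python 3
floor_division = 7 // 2  # 3 on both Python 2 and Python 3

print(true_division, floor_division)
```

Differences of this kind are small, but they are enough that a notebook written for one version may need minor edits before it runs on the other.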
For this course, the differences between Python 2.7 and 3.6 will be almost insignificant. To show this is the case, we have attached both Python 2 and Python 3 notebook files to all videos throughout the course where providing lecture code was relevant. So, which version should you install? Given that we have recorded the entire course on Python 2, install this version if you want to use notebook files containing the same code as that displayed in the videos. If, instead, you are fine with dealing with just a couple of small syntactical changes, or there’s some other reason why you prefer to work with Python 3, then install that version, and later use the notebook files stored in the “Python 3” folder in the resources sections of the lectures. For example, if you access the additional material for the lecture about “Variables”, and you click on the folder called “Python 2”, you will see notebook files that will run on Python 2 directly. In the same manner, if you go back and select the “Python 3” folder, you will only encounter files that are suitable for working with Python 3. Remember that we have kept this principle throughout. It is also valid for the second, financial part of the course, which can be taken with either Python version. So, it’s up to you which one to install now. Having said that, let’s go back to the installation process. To choose between the 32-bit and 64-bit versions, you can simply check your operating system. Newer computers almost certainly run on 64-bit processors, but if you would like to verify this before you begin, you could open the Control Panel from the Start menu. Then, select the “System” icon and check the information under “System Type”. In my case, it is 64-bit, so when I go back to the Anaconda website, I will select the 64-bit 2.7 Python version. Then, we must choose a directory where we want to save the download and press “Save”.
While waiting for the download to complete, you can decide whether to leave your e-mail with Continuum. This is not a necessary step, so you could also select “No Thanks” and you are good to go. When the download has finished, please double-click on the file to run the application. What follows is no different from the standard Windows installer. Agree and press “Next” until you have the chance to specify a destination folder. You can do this after selecting the “Browse” button. When satisfied with the indicated directory, click “Next” one more time. We suggest you tick both advanced options. The first option will automate more complicated processes, while the second will register Anaconda as the default Python on your computer if you have not installed some other package. Finally, click “Install” and proceed until you see the window where you can finalize the installation. We will not use “Anaconda Cloud” just yet, so I’ll untick the box. What you just installed is the whole Anaconda Distribution – the Python language, a text editor, many applications, and packages. You also have the Jupyter Notebook App. You can open the Start menu and select the respective icon from there. A new window will pop up. It will take a few seconds for the App to load. Once this is done, your web browser will open a new tab with the Jupyter Dashboard. Ok, so in this lesson, we’ll do a quick tour of the Jupyter dashboard. As soon as you load the notebook, the Jupyter dashboard opens. Each file and directory has a check box next to it. By ticking and unticking an item, you can manipulate the respective object – that means you can duplicate or shut down a running file. In addition, you can rename and delete folders. The selection menu allows you to select all the files of the same type in the console by expanding this button. For example, you could mark all the folders or all the running files. In addition, you can check this little box that will directly select all the items on the page.
The logic of directory management is the same as that of an operating system – files can be grouped into folders, and folders can contain other folders. From the “Upload” button in the top-right corner, you can upload a notebook into the directory you are in. The standard explorer box opens, and when you select a file and click on “Open”, it will immediately appear in your directory. Finally, you can expand the “New” button. From the drop-down list, you will most likely need to create a new text file, a new folder, or a new notebook file. A notebook file can contain code in any of the languages in your “Notebooks” section. When you create a new Python notebook file, it will be recorded in the IPython Notebook format, dot ipynb for short. Don’t fear this other new name, IPython – think of it as the predecessor of Jupyter. Consider the IPython Notebook format as a legacy file format… All right, let’s carry out a few operations. I will rename this folder to “Exercises”. Ok. Done. Now, I will create a new IPython Notebook file by expanding the “New” button and selecting one of the two Python formats I have here. Let’s use the default Python format. The browser immediately opens a new tab for me. This is the interactive shell we mentioned earlier. Here, you will write your code and see its output. Ok. Great. Now that we know more about the dashboard, we are ready to examine the shell and see how we can code in Jupyter. The field you see here is called a cell. You can access a cell by pressing “Enter”. Once you’ve done that, you’ll be able to see the cursor, so you can start typing code. The grey box is called the input field. Now, you are in “Edit mode”. The green cell border and the little pencil in the top right corner indicate that as well. To leave “Edit mode”, you have to press the “Escape” key. As I press “Escape”, I’ll go back to “Command mode”, which will allow me to edit the notebook as a whole.
This is why the cursor and the pencil disappeared. The cell border turned back to grey, and its left margin is blue. Now, I’ll press “Enter” again. I will put a short code that says x is a list composed of four numbers, 1, 2, 3, and 4. After that, I can ask the computer to print this list for me by typing “x”. I can execute these commands in two ways. The first one is to hold Ctrl and then press Enter. By doing this, the machine will execute the code in the cell, and I will “stay” there, meaning I will not have created or selected another cell. Observe that an output field with the same number as the input field appeared. Input and Output fields of the same number are grouped together. The output represents the machine’s response to your commands provided in the corresponding input and the respective field cannot be modified. The second option allows for a more fluid code writing. To execute the same code, hold “Shift” and then press “Enter”. The previous two commands are being executed and then a new cell where you can write code is created. If you use “Shift” and “Enter”, you can continue typing code easily. It is amusing that you can cut, copy, and paste cells. You can use the buttons on the main menu. Since we are keen supporters of ergonomics, we suggest you get used to the keyboard shortcuts. When coding, you will mainly have to use the keyboard, so it’s worth memorizing different keyboard combinations that would allow you to work faster. Let’s see how we can apply shortcuts in practice. I will cut this cell by selecting it, and then pressing the “X” key. Remember you can always use the arrow keys to navigate along your notebook file. So, I will move up and then directly press the “V” key to paste it. Voilà! My cell was moved up here, below the cell I had selected before I pressed the “V” key. Now, I will press “C” to copy the same cell. I will use the arrow keys again to go down this time, and at the end, I will press “V” again. 
As you can see, this cell was copied and pasted below. The other two buttons in the menu cannot be substituted by keyboard shortcut combinations. They allow you to move a selected cell up or down, just as I am doing right now. Observe that the corresponding output field moves along with its input cell. Lovely! Instead of carrying out a command with Shift and Enter, you might prefer clicking on this button with the mouse – the result will be the same. What is interesting is that, when you work with more complex code that requires tougher calculations from the computer, a little star will appear over here, inside the square brackets, while the code is running. Sometimes, this process might take too long to complete, so to stop or interrupt it, you can use this classic “stop” symbol. Three other shortcuts could accelerate your coding a lot. Suppose I have selected this cell. After I press the “A” key, a new empty cell will be inserted above. If I select the same cell and press “B”, another cell will be created just below. Imagine the last step was a mistake and I need to delete that empty cell. In this case, it has already been selected, so I can press the “D” key twice, and it will readily disappear. Wonderful! All the cells we saw so far were code cells. Let’s see what a markdown cell is. It is a cell that contains strictly documentation - text not executed as code. It will contain some message you would like to leave to the reader of the file. To convert a selected cell into a markdown cell, you should either expand this list and opt for “markdown”, or simply press the “M” key. Press “Enter” to access the cell and type some text. When I run this cell, the output will be a simple statement. Now, if you want to turn a markdown cell back into a code cell, select it and opt for “Code” from this drop-down menu or press the “Y” key. To conclude, there are two advantages of using Jupyter.
First, when your code becomes longer, markdown cells turn out to be useful, as they allow you to leave comments and explain the solution you’ve created. This is why practitioners love to use them. The other benefit of using this App is you can select and execute whichever cell you want to – you need not run all the previous cells to run a particular cell! This allows for solving a problem in pieces and saves a lot of computation time. This was a long but indispensable lesson. We went through everything you need to know before you start coding. Thank you for watching! If you found this video interesting and want to gain an edge in your career, make sure to like, comment, and subscribe! And don’t forget to check out some of our other videos for another quick win in the data science skills department!
Python Tutorial for Beginners: Introduction to Programming | Install Python and Jupyter Notebook. Posted by 林宜悉 on January 14, 2021.