20200310 Python (functions, user input and output, reading and writing files, classes, inheritance, overriding, overloading)

GOGO치삼 2020. 3. 19. 17:32


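# 1. Coffee machine program
coffee = 10
price = 300
# Exactly 300: serve a coffee
# Less than 300 (e.g. 200): return the money
# More than 300 (e.g. 500): serve a coffee and give 200 in change
# After 10 cups are sold, stop (no more coffee)

while True:
    money = int(input("Insert money: "))
    if money == price:
        print("Here is your coffee.")
        coffee = coffee - 1
        print('%d coffees left.' % coffee)
    elif money > price:
        print("Here is your change of %d and your coffee." % (money - price))
        coffee = coffee - 1
        print('%d coffees left.' % coffee)
    else:
        print("Your money is returned and no coffee is served.")
        print('%d coffees left.' % coffee)
    if coffee == 0:
        print("The coffee has run out. Sales are stopped.")
        break

Insert money: 100

Your money is returned and no coffee is served.

10 coffees left.

Insert money: 1235

Here is your change of 935 and your coffee.

9 coffees left.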

# Functions
# A function with no input that only returns a value
def say():
    return 'hi'
a = say()
say()
# print(a)

'hi'

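# A function with no return value
# The print statement is only one of the statements in the function body; it is not a return value
# Note: naming this function sum hides Python's built-in sum()
def sum(a,b):
    print('The sum of %d and %d is %d.' %(a,b,a+b))
    
sum(3,4)

The sum of 3 and 4 is 7.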

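# A function that takes any number of input values
# Putting * in front of a parameter name, as in *args, collects all the inputs into a tuple
def sum_many(*args):
    total=0 # starting value
    for i in args:
        # total+=i   (Ctrl + / toggles a comment in Jupyter)
        total=total+i
    return total

sum_many(1,2,3,4,5)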

# A function that takes several input values
def sum_mul(choice,*args):
    if choice =='sum':
        result=0
        for i in args:
            result+=i
    elif choice=='mul':
        result=1 # the starting value for multiplication is 1
        for i in args:
            result*=i
    return result
print(sum_mul('sum',1,2,3,4,5))
print(sum_mul('mul',1,2,3,4,5))

15

120

# 1. Write a multi-value calculation function for subtraction and division
# ('sub', 2,4,6,8,10)
# ('div', 2,4,6,8,10)

def sub_div(choice,*args):
    if choice =='sub':
        result=100 # starting value chosen for this exercise
        for i in args:
            result-=i
    elif choice=='div':
        result=100 # starting value chosen for this exercise
        for i in args:
            result/=i
    return result
print(sub_div('sub',2,4,6,8,10))
print(sub_div('div',2,4,6,8,10))

70

0.026041666666666668

# A function always returns a single value; here that single value is a tuple
def sum_and_mul(a,b):
    return a+b,a*b
result = sum_and_mul(3,4)
print(result)

(7, 12)

# To receive the single tuple as if it were two results, call the function like this
result1,result2=sum_and_mul(3,4)
print(result1)
print(result2)

7

12

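# As soon as a function reaches a return statement it hands back the result
# and exits, so the second return statement is never executed
def sum_and_mul(a,b):
    return a+b
    return a*b
result=sum_and_mul(2,3)
print(result)
# Only the first return that is reached takes effect. To get two or more results,
# return them together as a tuple and unpack them into result1, result2 as above.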

# Printing a string and having a return value are two completely different things
def say_nick(nick):
    if nick=='fool':
        return 'No way!'
    print('My nickname is %s' %nick)
say_nick('yahoo')
say_nick('fool')

My nickname is yahoo

'No way!'

# Setting a default value for a parameter
def say_myself(name, old, man =True):
    print('My name is %s.' %name)
    print('I am %d years old.' %old)
    if man:
        print('I am male.')
    else:
        print('I am female.')
        
# say_myself('Hong Gildong',20) gives no value for man,
# so man takes its default value True; it is the same as
# say_myself('Hong Gildong',20,True)
say_myself('Hong Gildong',20)

My name is Hong Gildong.

I am 20 years old.

I am male.

# A parameter without a default value cannot come after a parameter with a default value
# (name, man=True, old) raises a SyntaxError
def say_myself(name, man=True, old):
    print('My name is %s.' %name)
    print('I am %d years old.' %old)
    if man:
        print('I am male.')
    else:
        print('I am female.')
say_myself('Hong Gildong',20)
# Parameters with default values must always come last.
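
The rule above can be checked with a corrected signature. This is only a minimal sketch: the function returns a string instead of printing so the result is easy to inspect, and the English wording and the keyword call man=False are illustrative choices, not part of the original notebook.

```python
def say_myself(name, old, man=True):
    """Return a short self-introduction; the defaulted parameter comes last."""
    gender = 'male' if man else 'female'
    return 'My name is %s, I am %d years old, and I am %s.' % (name, old, gender)


# man is omitted, so the default True is used
print(say_myself('Hong Gildong', 20))
# a defaulted parameter can also be overridden by keyword
print(say_myself('Hong Gildong', 20, man=False))
```

Keeping defaulted parameters at the end lets Python match positional arguments from left to right without ambiguity.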

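# Scope of a variable declared inside a function
a=1
def vartest(a):
    a=a+1
print(vartest(a))
print(a)
# vartest has no return value, so the first print shows None; the outer a is still 1

None

1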

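# Changing a variable outside a function from inside the function
# Method 1: use return
a=1
def vartest(a):
    a=a+1
    return a
a=vartest(a)
print(a)
# This is the recommended style

# Method 2: use the global statement
# The statement global a means the function uses the variable a defined outside it directly
# Functions are best kept independent, so a function that depends on outside variables is not recommended
a = 1
def vartest():
    global a # refers to the a defined outside the function
    a=a+1
    
vartest()
print(a)
# Avoid this style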

# User input and output
a=input()
a

501

'501'

number =input('Enter a number: ')

Enter a number: 5

# Keep reading input until an empty line is entered
while 1:
    data=input()
    if not data: break

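# String literals written next to each other are joined, exactly like the + operation
# Separating strings with commas prints them with a space between them
print("life" "is" "too" "short")
print("life"+"is"+"too"+"short")
print("life", "is", "too", "short")

lifeistooshort

lifeistooshort

life is too short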

# Printing results on a single line
for i in range(10):
    print(i,end=' ')
print("\n")
for i in range(10):    
    print(i)

# Reading and writing files
f=open('test1.txt','w') # modes: 'w' write, 'r' read, 'a' append
f.close()

f=open('test1.txt','w')
for i in range(1,11):
    data ='Line %d.\n' % i
    f.write(data)
f.close()
# writing

# Reading a file stored outside the program
f=open('test1.txt','r')
line = f.readline()
print(line)
f.close()
# reads one line

Line 1.

# readlines() reads every line of the file and returns a list whose elements are the lines.
# Note the extra s compared with f.readline()
f=open('test1.txt','r')
lines = f.readlines()
for line in lines:
    print(line)
f.close()

# f.read() returns the entire content of the file as one string.
f = open('test1.txt', 'r')
data = f.read() # store what was read in data
print(data) # print the data that was read
f.close() # always close the file after reading

# Appending new content to a file
f=open('test1.txt','a') # 'a' appends to an existing file
for i in range(11, 16):
    data = 'Line %d.\n' % i 
    f.write(data)
f.close()

# With the with statement, the file object f is closed automatically on leaving the with block
with open('test2.txt','w') as f:
    f.write('Python is fun.')

with open('test2.txt','r') as f:
    data=f.read()
    print(data)

Python is fun.

# Why classes are needed
result = 0

def add(num):
    global result
    result += num
    return result
print(add(3))
print(add(4))

# A situation that needs two calculators
result1=0
result2=0
def adder1(num):
    global result1
    result1 += num
    return result1
def adder2(num):
    global result2
    result2 += num
    return result2
print(adder1(3))
print(adder1(4))
print(adder2(3))
print(adder2(7))
# Classes are used to avoid this kind of duplication.

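# cal1 and cal2 are separate calculators (instances) created from the Calculator class,
# and each one keeps its own result
# With a class, adding more calculators only requires creating more instances
class Calculator:
    def __init__(self):
        self.result = 0 # constructor: every instance starts with its own result

    def add(self, num): # a method defined in a class always takes self as its first parameter
        self.result += num # comparable to this.result in Java
        return self.result

cal1 = Calculator() # cal1 on its own is an object; in relation to Calculator it is an instance of that class
cal2 = Calculator()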

print(cal1.add(3))
print(cal1.add(4))
print(cal2.add(3))
print(cal2.add(7))

# Adding a subtraction method to the Calculator class shown above
class Calculator:
    def __init__(self): 
        self.result = 0

    def add(self, num):
        self.result += num
        return self.result
    
    def sub(self, num):
        self.result -= num
        return self.result

cal1 = Calculator() # cal1 on its own is an object; in relation to Calculator it is an instance of that class
cal2 = Calculator()

print(cal1.add(3))
print(cal1.add(4))
print(cal1.add(5))
print('\n')
print(cal2.add(3))
print(cal2.add(4))
print(cal2.add(5))

# Building a four-operation calculator class
# Letting the object store numbers
# The setdata method has three parameters (self, first, second), but in practice
# only two values are passed, as in a.setdata(4, 2).
# The reason is that when a.setdata(4, 2) is called, the object a that called the method
# is passed automatically as the first parameter, self.
# Writing the first parameter self explicitly is a distinctive feature of Python

class FourCal:
    def setdata(self, first, second): # method parameters
        self.first = first # method body
        self.second = second # method body
a = FourCal()
a.setdata(4,2)
print(a.first)
print(a.second)

b=FourCal()
b.setdata(3,7)
print(b.first)
print(b.second)

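A constructor is a method that is called automatically when an object is created.

In Python, a method named __init__ becomes the constructor.

The __init__ method is identical to the setdata method except for its name. Because it is named __init__, it is recognized as the constructor and is called automatically at the moment the object is created.

Like any other method, __init__ automatically receives the object being created as its first parameter, self.

When __init__ is called, the instance variables first and second are created, just as when setdata is called.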

# Building a four-operation calculator class
class FourCal:
    def __init__(self,first,second): # constructor
        self.first = first
        self.second = second
        
#     def setdata(self,first,second): # could be used to set the data after creation
#         self.first = first
#         self.second = second
        
    def add(self):
        result = self.first + self.second
        return result
    
    def minus(self):
        result = self.first - self.second
        return result
    
    def mul(self):
        result = self.first * self.second
        return result
    
    def divide(self):
        result = self.first / self.second
        return result
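
The class above is defined but never called in the notes. A short usage sketch may help; the class is repeated here in compact form so the block runs on its own, and the values 4 and 2 are taken from the earlier setdata example.

```python
class FourCal:
    def __init__(self, first, second):
        # the constructor stores both operands on the instance
        self.first = first
        self.second = second

    def add(self):
        return self.first + self.second

    def minus(self):
        return self.first - self.second

    def mul(self):
        return self.first * self.second

    def divide(self):
        return self.first / self.second


a = FourCal(4, 2)   # __init__ runs here, so no separate setdata call is needed
print(a.add())      # 6
print(a.minus())    # 2
print(a.mul())      # 8
print(a.divide())   # 2.0
```

Note that / always produces a float in Python 3, which is why the last result is 2.0 rather than 2.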

class HouseHong:
    lastname='Hong' # class variable shared by every instance
pey = HouseHong()
pes = HouseHong()
print(pey.lastname)
print(pes.lastname)

Hong

Hong

class HouseHong:
    lastname = 'Hong'
    def setname(self, name):
        self.fullname = self.lastname + name
    def travel(self, where):
        print('%s travels to %s' %(self.fullname,where))
pey = HouseHong()
pey.setname('Gildong')
pey.travel('Jeju Island')

HongGildong travels to Jeju Island

# This raises an AttributeError: travel needs self.fullname, which only exists after setname has been called
pey = HouseHong()
pey.travel('제주도')

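# With the __init__ method, initial values can be given at the moment the instance is created (important!!)
# With setname, the object is first created with pey = HouseHong() and the value is set afterwards,
# whereas with __init__ the value is passed in while the object is being created, so no separate call is needed.

class HouseHo:
    lastname='Ho'
    def __init__(self, name):
        self.fullname = self.lastname + name
        
    def travel(self,where):
        print('%s travels to %s' %(self.fullname, where))
        
pey = HouseHo('Dylan')
pey.travel('Jeju Island')

HoDylan travels to Jeju Island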


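class HouseHo:
    lastname='Woori'
    def __init__(self, name):
        self.fullname = self.lastname + name
        
    def travel(self,where):
        print('%s travels to %s' %(self.fullname, where))
        
mr = HouseHo('Bobby')
mr.travel('Jeju Island')

WooriBobby travels to Jeju Island

# Class inheritance (for reusing an existing class); the author's note compares an interface to names that are always repeated
class HouseOh(HouseHo): # HouseHo is the parent class being inherited from
    lastname='Oh'

ys=HouseOh('Chovy')
ys.travel('Ulleungdo')
# Run this only after the class above has been executed

OhChovy travels to Ulleungdo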

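# Method overriding (changing a method of the parent class)
# The travel method is implemented again inside the HouseOh class under the same name
# Re-implementing a method with the same name like this is called overriding
class HouseOh(HouseHo):
    lastname='Oh'
    def travel(self,where, day):
        print('%s travels to %s on day %s' %(self.fullname, where,day))

ch=HouseOh('Yeppeuni')
ch.travel('Seoul',3)

OhYeppeuni travels to Seoul on day 3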

# Operator overloading
# A technique that lets operators be used between objects of a class
# Using the + operator on an object calls its __add__ method

class HouseLee:
    lastname ="Lee"
    def __init__(self,name):
        self.fullname=self.lastname+name
        
    def travel(self,where):
        print('%s travels to %s' %(self.fullname, where))
        
    def love(self,other):
        print('%s and %s fell in love'%(self.fullname, other.fullname))
        
    def __add__(self, other): # defines what object + object does
        print('%s and %s got married'%(self.fullname, other.fullname))

class HouseSung(HouseLee):
    lastname='Sung'
    def travel(self, where, day):
        print('%s travels to %s for %d days.' %(self.fullname, where, day))
        
        
mr = HouseLee('Mongryong') # create an object from the class (like new in Java)

ch = HouseSung('Chunhyang')
ch.travel('Italy', 30)

mr.love(ch) # love on mr receives ch as other

mr + ch # calls mr.__add__(ch)

SungChunhyang travels to Italy for 30 days.

LeeMongryong and SungChunhyang fell in love

LeeMongryong and SungChunhyang got married
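
In the example above __add__ only prints a message. In many programs __add__ returns a new object instead, so that the result of + can be stored and reused. The sketch below illustrates that variant; the Money class and its amount attribute are hypothetical names chosen for this illustration and do not appear in the original notebook.

```python
class Money:
    """Hypothetical value class used only to illustrate __add__."""

    def __init__(self, amount):
        self.amount = amount

    def __add__(self, other):
        # Money + Money builds and returns a new Money object
        return Money(self.amount + other.amount)

    def __repr__(self):
        return 'Money(%d)' % self.amount


total = Money(300) + Money(200)   # calls Money(300).__add__(Money(200))
print(total)                      # Money(500)
```

Returning a new object leaves both operands unchanged, which matches how + behaves for built-in numbers and strings.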


20200319 Python pandas (data preprocessing)

GOGO치삼 2020. 3. 19. 17:09


Data preprocessing

Handling missing data
Handling duplicate data
Data standardization
Handling categorical data
Normalization
Time series data

Handling missing data


import seaborn as sns # seaborn is a plotting library that also provides sample datasets

# Load the titanic dataset
titanic_df = sns.load_dataset('titanic') 
display(titanic_df.head())
print(titanic_df.shape) # 891 rows, 15 columns
display(titanic_df.isnull().sum()) # number of nulls in each column
titanic_df.info() # info() prints its summary itself and returns None
print(type(titanic_df))

#value_counts
titanic_df.survived.value_counts()

0 549

1 342

Name: survived, dtype: int64

Q. Count the NaN values in the deck column.

# Count the NaN values in the deck column
nan_deck = titanic_df['deck'].value_counts(dropna=False)
#nan_deck=df['deck'].value_counts()
print(nan_deck)
display(titanic_df['deck'].isnull().head())
print(titanic_df['deck'].isnull().sum(axis=0)) # add up the True values down the column

Q. Find the null values in the first 5 rows of titanic_df and print them (True/False).

# Find missing data with the isnull() method
display(titanic_df.head().isnull())
print()
display(titanic_df.head().notnull())

Q. Find the number of nulls in the 'deck' column of titanic_df.

# Count missing data with the isnull() method
print(titanic_df.survived.isnull().sum(axis=0))
print(titanic_df.deck.isnull().sum(axis=0))

0

688

# Number of nulls in each column
print(titanic_df.isnull().sum(axis=0))

Q. Use a for loop to compute and print the number of nulls in each column of titanic_df.

(Look up missing_count with exception handling; when the exception occurs, print 0.)
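
The notes state this exercise without an answer. One possible approach is sketched below. To keep the block runnable without downloading the titanic data, it uses a small made-up DataFrame named df as a stand-in for titanic_df, and null_counts is a helper dictionary introduced only for this sketch.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for titanic_df so the sketch runs on its own
df = pd.DataFrame({
    'age': [22.0, np.nan, 26.0],
    'deck': [np.nan, 'C', np.nan],
    'sex': ['male', 'female', 'female'],
})

null_counts = {}
for col in df.columns:
    # value_counts() of the boolean mask as a dict, e.g. {True: 2, False: 1}
    missing_count = df[col].isnull().value_counts().to_dict()
    try:
        null_counts[col] = missing_count[True]
    except KeyError:
        # no True key means the column has no missing values
        null_counts[col] = 0
    print(col, ':', null_counts[col])
```

With the real data, replacing df by titanic_df should give the same per-column counts as titanic_df.isnull().sum().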

# thresh=500: keep only the columns that have at least 500 non-NaN values
#     - the deck column is dropped (688 of its 891 values are NaN, leaving 203 non-NaN values)
# how='any': drop if there is at least one NaN value
# how='all': drop only if every value is NaN

df_thresh = titanic_df.dropna(axis=1, thresh=500)
# df1.dropna(axis=0) # drop rows that contain NaN
# df1.dropna(axis=1) # drop columns that contain NaN

print(df_thresh.columns)

Index(['survived', 'pclass', 'sex', 'age', 'sibsp', 'parch', 'fare', 'embarked', 'class', 'who', 'adult_male', 'embark_town', 'alive', 'alone'], dtype='object')

Q. Replace the NaN values in the embark_town column with the value of the row just before them (row 828 for the NaN shown here), then print the result.

# Replace the NaN in embark_town with the value of the preceding row (828)
import seaborn as sns

titanic_df = sns.load_dataset('titanic')
titanic_df.embark_town.isnull().sum()
#print(titanic_df['embark_town'][827:831])
print(titanic_df.embark_town.iloc[827:831])
print()
# Forward fill: each NaN takes the value of the nearest preceding row (it is not filled with 0)
titanic_df['embark_town'] = titanic_df['embark_town'].ffill()
print(titanic_df['embark_town'][827:831])

Handling duplicate data

# Create a DataFrame that contains duplicate data
import pandas as pd
df = pd.DataFrame({'c1':['a','a','b','a','b'],
                  'c2':[1,1,1,2,2,],
                  'c3':[1,1,2,2,2]})
print(df)

# Find rows that duplicate an earlier row across all columns
df_dup = df.duplicated()
print(df_dup)

# Find duplicate values within one column of the DataFrame
col_dup_sum = df['c2'].duplicated().sum()
col_dup = df['c2'].duplicated()
print(col_dup_sum,'\n') # how many values in c2 repeat an earlier value
print(col_dup)

Q. Remove the duplicate rows from df, store the result in df2, and print it.

# Remove duplicate rows from a DataFrame: drop_duplicates()
print(df)
print()
df2= df.drop_duplicates()
print(df2)

Q. Remove duplicate rows from df based on the c2 and c3 columns, store the result in df3, and print it.

# print(df)
# print()
# df3= df[['c2','c3']].drop_duplicates()
# print(df3)

df3 = df.drop_duplicates(subset=['c2','c3'])
print(df3)
# Each (c2, c3) pair, [1,1], [1,2], [2,2], is treated as a unit, and rows that repeat a pair are removed.

Converting data units

# Create df with the read_csv() function
import pandas as pd
auto_df = pd.read_csv('auto-mpg.csv')
auto_df.head()

# Convert mpg (miles per gallon) to kpl (kilometers per liter); mpg_to_kpl is about 0.425
mpg_to_kpl = 1.60934/3.78541
print(mpg_to_kpl)

0.42514285110463595

Q. Convert 'mpg' to 'kpl' in a new column and print the first 3 rows, rounded to two decimal places.

import pandas as pd

# Create df with the read_csv() function
auto_df = pd.read_csv('auto-mpg.csv', header=None)

# Set the column names
auto_df.columns = ['mpg','cylinders','displacement','horsepower','weight','acceleration','model year','origin','name'] 

display(auto_df.head(3))
print('\n')

auto_df['kpl'] = round((auto_df['mpg']*mpg_to_kpl),2)
display(auto_df.head(3))

# Check the data type of each column
print(auto_df.dtypes)

Q. Print the unique values of the horsepower column.

print(auto_df['horsepower'].unique()) 
# Because of the '?' entries, the whole horsepower column is read as object.

Q. Remove the missing data '?' from the horsepower column, then print the number of NaN values.

# Remove the missing data ('?')
import numpy as np
auto_df['horsepower'] = auto_df['horsepower'].replace('?',np.nan) # replace '?' with np.nan
auto_df.dropna(subset=['horsepower'], axis=0, inplace=True) # drop the rows with missing data
print(auto_df['horsepower'].isnull().sum())
print(auto_df.horsepower.dtypes)

0

object

Q. Convert the horsepower strings to floats, then check the data type.

auto_dff=auto_df['horsepower'].astype('float')
print(auto_dff)
print(auto_dff.dtypes)

Q. Do the following.

# Check the unique values of the origin column
print(auto_df['origin'].unique())

# Convert the integer data to string data
auto_df['origin'] = auto_df['origin'].replace({1:'USA',2:'EU',3:'JAPAN'})

print(auto_df['origin'].unique())

[1 3 2]

['USA' 'JAPAN' 'EU']

Q. Check the data type of the origin column, convert it to a categorical type, and print it.

# continuous (1,2,3,4,5..) / categorical ('AG','BG'..)
print(auto_df['origin'].dtypes)

object

# Convert the origin column from string to categorical
auto_df['origin']=auto_df['origin'].astype('category')
print(auto_df['origin'].dtypes)

Q. Convert the origin column from categorical back to string, then print the data type.

# Convert the origin column from categorical back to string
auto_df['origin']=auto_df['origin'].astype('str') # object can be used instead of str
print(auto_df['origin'].dtypes)

object

Q. Convert the integer model year column to categorical, then print it.

# auto_df['model year'].unique() : [70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82]
display(auto_df['model year'].sample(3))
auto_df['model year'] = auto_df['model year'].astype('category')
display(auto_df['model year'].sample(3))

Handling categorical data

auto_df['horsepower'] = auto_df['horsepower'].replace('?', np.nan) # replace '?' with np.nan
auto_df.dropna(subset=['horsepower'], axis=0, inplace=True) # drop the rows with missing data
auto_df['horsepower'] = auto_df['horsepower'].astype('float') # strings to floats
auto_df['hp'] = auto_df['horsepower'].astype('int') # integer copy of horsepower
print() 
auto_df.info()

# Use np.histogram to get the list of boundary values that split the data into 3 bins
count, bin_dividers = np.histogram(auto_df['horsepower'], bins=3) # bins=3 because we want 3 intervals
display(bin_dividers) # [ 46. , 107.33333333, 168.66666667, 230.] : 46 ~ 107, 107 ~ 168, 168 ~ 230, three intervals in total
print()
 
# Name the 3 bins: '저출력' (low output), '보통출력' (medium output), '고출력' (high output)
bin_names = ['저출력', '보통출력', '고출력']
 
# Use pd.cut to assign each value to one of the 3 bins
auto_df['hp_bin'] = pd.cut(x=auto_df['horsepower'],  # data array
                           bins=bin_dividers,         # list of boundary values
                           labels=bin_names,          # bin names
                           include_lowest=True)       # include the first boundary value
 
# Print the first 5 rows of the horsepower and hp_bin columns
display(auto_df[['horsepower', 'hp_bin']].head())

array([ 46. , 107.33333333, 168.66666667, 230. ])
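
The boundary values printed above follow from simple arithmetic: the range from the minimum to the maximum is cut into equally wide pieces. The plain-Python sketch below reproduces that idea without NumPy or pandas. The helper names equal_width_edges and assign_bin, the English labels, and the five sample values are assumptions made for this illustration; only the minimum 46 and maximum 230 come from the output above.

```python
def equal_width_edges(values, bins):
    """Return bins + 1 boundary values that split [min, max] into equal widths."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / bins
    return [lo + i * width for i in range(bins)] + [hi]


def assign_bin(value, edges, labels):
    """Pick the label of the right-closed interval that contains value."""
    if value < edges[0]:
        raise ValueError('value is below the first edge')
    for i, label in enumerate(labels):
        if value <= edges[i + 1]:
            return label
    raise ValueError('value is above the last edge')


horsepower = [46.0, 130.0, 165.0, 198.0, 230.0]   # hypothetical sample values
edges = equal_width_edges(horsepower, 3)
labels = ['low', 'medium', 'high']

print(edges)
for hp in horsepower:
    print(hp, assign_bin(hp, edges, labels))
```

The intervals are closed on the right and the lowest value is included in the first bin, which is the behavior requested from pd.cut with include_lowest=True.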

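Dummy variables

- Categorical data that represents categories often cannot be used directly in ML algorithms such as regression analysis.
- It is converted into input the computer can work with: dummy variables expressed as the numbers 0 or 1.
- 0 and 1 carry no notion of larger or smaller; they only mark whether a feature is present (1) or absent (0).
- This is called one-hot encoding (converting to one-hot vectors).

# Convert the categorical data in the hp_bin column to dummy variables
horsepower_dummies = pd.get_dummies(auto_df['hp_bin']) 
horsepower_dummies.head(15)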

from sklearn import preprocessing

# Create the encoder objects for preprocessing
label_encoder = preprocessing.LabelEncoder()        # create the label encoder
onehot_encoder = preprocessing.OneHotEncoder()      # create the one-hot encoder

# Use the label encoder to convert string categories to integer categories
onehot_labeled = label_encoder.fit_transform(auto_df['hp_bin'].head(15))
print(onehot_labeled)
print(type(onehot_labeled))

[1 1 1 1 1 0 0 0 0 0 0 1 1 0 2]

# Reshape into a 2-dimensional array (one column)
onehot_reshaped = onehot_labeled.reshape(len(onehot_labeled),1)
print(onehot_reshaped)
print(type(onehot_reshaped))

# Convert to a sparse matrix (one-hot encoded)
onehot_fitted = onehot_encoder.fit_transform(onehot_reshaped)
print(onehot_fitted)
print(type(onehot_fitted))
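
To see what the two encoders do conceptually, the steps can be imitated in plain Python. This is a simplified sketch, not the scikit-learn implementation: label_encode and one_hot are hypothetical helper names, the English category names are sample data, and the result is a dense list of lists rather than a sparse matrix.

```python
def label_encode(values):
    """Map each distinct category to an integer code, in sorted order of the categories."""
    classes = sorted(set(values))
    index = {c: i for i, c in enumerate(classes)}
    return classes, [index[v] for v in values]


def one_hot(codes, n_classes):
    """Turn each integer code into a row with a single 1 at the code's position."""
    return [[1 if code == k else 0 for k in range(n_classes)] for code in codes]


categories = ['medium', 'medium', 'high', 'low']   # hypothetical sample data
classes, codes = label_encode(categories)
rows = one_hot(codes, len(classes))

print(classes)   # ['high', 'low', 'medium']
print(codes)     # [2, 2, 0, 1]
for row in rows:
    print(row)
```

Because the codes follow the sorted order of the category names, the integer assigned to a category depends on its spelling, not on any ranking between categories.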

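Normalization

Differences in the relative magnitude of the numeric data in each variable can change the result of an ML analysis (e.g. variable a ranges over 1 ~ 1000, variable b over 0 ~ 1).
These differences in relative magnitude need to be removed, so the values in each column (variable) are divided by a common scale to normalize them.
After normalization the data lies in the range 0 ~ 1, or -1 ~ 1 when there are negative values.
The formula is either (each value in the column) / (maximum of the column), or (each value in the column - minimum) / (maximum of the column - minimum).

Standardization
Converting the values so that they have a mean of 0 and a variance (or standard deviation) of 1, like the standard Gaussian (normal) distribution.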
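
Standardization is described here but not demonstrated in these notes. A minimal plain-Python sketch follows; the function name standardize and the sample values are assumptions for illustration, and the population standard deviation is used, which is one of two common conventions.

```python
import statistics


def standardize(values):
    """Return z-scores: (value - mean) / population standard deviation."""
    mean = statistics.mean(values)
    std = statistics.pstdev(values)
    return [(v - mean) / std for v in values]


horsepower = [46.0, 130.0, 165.0, 198.0, 230.0]   # hypothetical sample values
z = standardize(horsepower)

print(z)
print(statistics.mean(z))     # very close to 0
print(statistics.pstdev(z))   # very close to 1
```

Unlike min-max normalization, the z-scores are not confined to a fixed range such as 0 ~ 1; they only share the same mean and spread.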

# Remove the missing data ('?') from the horsepower column and convert it to floats
auto_df = pd.read_csv('auto-mpg.csv', header=None) 
auto_df.columns = ['mpg','cylinders','displacement','horsepower','weight','acceleration','model year','origin','name']  
auto_df['horsepower'] = auto_df['horsepower'].replace('?', np.nan)  # replace '?' with np.nan
auto_df.dropna(subset=['horsepower'], axis=0, inplace=True)   # drop the rows with missing data
auto_df['horsepower'] = auto_df['horsepower'].astype('float')  # strings to floats
 
# Check the maximum (max) in the summary statistics of the horsepower column
print(auto_df.horsepower.describe()) 
print()
 
# Divide every value by the absolute value of the column maximum and store the result
 
auto_df.horsepower = auto_df.horsepower / abs(auto_df.horsepower.max())
 
print(auto_df.horsepower.head())
print() 
print(auto_df.horsepower.describe())

# Create df with the read_csv() function
auto_df = pd.read_csv('auto-mpg.csv', header=None) 
auto_df.columns = ['mpg','cylinders','displacement','horsepower','weight','acceleration','model year','origin','name']  
# Remove the missing data ('?') from the horsepower column and convert it to floats
auto_df['horsepower'] = auto_df['horsepower'].replace('?', np.nan)  # replace '?' with np.nan
auto_df.dropna(subset=['horsepower'], axis=0, inplace=True)   # drop the rows with missing data
auto_df['horsepower'] = auto_df['horsepower'].astype('float') # strings to floats
 
# Check the maximum (max) and minimum (min) in the summary statistics of the horsepower column
print(auto_df.horsepower.describe()) 
print()
 
# Numerator: each value minus the column minimum; denominator: column maximum minus column minimum
# The largest value again becomes 1
 
min_x = auto_df.horsepower - auto_df.horsepower.min() 
min_max = auto_df.horsepower.max() - auto_df.horsepower.min() 
auto_df.horsepower = min_x / min_max
 
print(auto_df.horsepower.head()) 
print() 
print(auto_df.horsepower.describe())


20200308~20200309 Python basics

GOGO치삼 2020. 3. 19. 15:19


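Introduction to Python

An interpreted language created by Guido van Rossum.
Widely used in industry. A commonly cited example is the claim that more than 50% of the software built at Google is written in Python.

Characteristics of Python

Among programming languages, Python is said to come closest to the way people think. ex) if 4 in [1, 2, 3, 4]: print('4 is in the list.')
The syntax is simple and concise and closely resembles human reasoning.
Most programs a programmer wants to build can be written in Python.
Where a task could be done in 100 different ways, Python prefers to use the one best way.
You can concentrate on the part you want to solve without worrying about everything else.

What you can do with Python

Build system utilities.
Build GUI programs easily, because good GUI programming tools are available (ex: Windows desktop programs).
Use programs written in C or C++ from Python, and use programs written in Python from C or C++.
Do fast numerical computation with NumPy, a numerical module whose core is written in C.
Access databases such as Oracle, MySQL, and PostgreSQL with the available tools.
Analyze data more easily and effectively with the pandas module, a representative data analysis library.

What Python is not suited for

Complex, low-level system operations.
Developing Android and iPhone apps is difficult.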


# Python data types

# A list is written with []; the elements inside [] are separated by , and kept in order.
list1 = [1, 2, 3, 4, 5] # numbers
list2 = ['a', 'b', 'c'] # characters
list3 = [1, 'a', 2, 'b', 3, 'c', [1,2,3], ['a','b','c']] # numbers and characters mixed; a list can contain lists
print(list1)
print(list2)
print(list3)
list1[0] = 6
print(list1)

[1, 2, 3, 4, 5]

['a', 'b', 'c']

[1, 'a', 2, 'b', 3, 'c', [1, 2, 3], ['a', 'b', 'c']]

[6, 2, 3, 4, 5]

# A list can also hold a function
def myfunc():
    print('Hello')
list4 = [1, 2, myfunc]
print(list4)
list4[2]()

[1, 2, <function myfunc at 0x000002405267D558>]

Hello

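# Modifying, changing, and deleting list elements
a = [1, 2, 3]
a[2] =4
a[1:2] # slice from index 1 up to, but not including, index 2; the result is not used here
print(a)

[1, 2, 4]

# Modifying, changing, and deleting list elements
a = [1, 2, 3]
a[2] =4
a[1:2] = ['a', 'b', 'c'] # replace the one-element slice with three elements
print(a)
del a[1:4] # delete indexes 1 to 3
print(a)

[1, 'a', 'b', 'c', 4]

[1, 4]

a=[1,2,3,4,5,6,7,8,9]
# target: [15,5,6,7]
del a[0:3]
a[0]=15
del a[4:6]
print(a)

[15, 5, 6, 7]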

# Sorting (sort)
b=[2,3,4,1]
b.sort() # ascending order
print(b)
c=['a','c','d']
c.sort(reverse=True) # descending order
print(c)

[1, 2, 3, 4]

['d', 'c', 'a']

d=['a','b','d']
d.reverse() # reverse the order in place
print(d)
print(sorted(d,reverse=True)) # sorted() returns a new sorted list

['d', 'b', 'a']

['d', 'b', 'a']

# Return the index of a list element
e=[1,2,3]
print(e.index(3)) # returns the index of the value 3, which is stored at index 2
print(e.index(1))

2

0

f=[1,2,3]
print(f.pop()) # prints the last element and removes it from the list   (Ctrl+Enter runs the cell)
print(f)

3

[1, 2]

# (Alt+Enter runs the cell and inserts a new one)
g=[1,2,3,1] # count(x) returns how many times x appears in the list
print(g.count(1))

2

h=[1,2,3] # extending a list
h.extend([4,5,])
print(h)

[1, 2, 3, 4, 5]

a=[1,2,3]
a.append(4)
print(a)
a.append([5,6]) # a list can be appended as a single element
print(a)

[1, 2, 3, 4]

[1, 2, 3, 4, [5, 6]]

# Tuples
# A data type similar to a list, except that its elements cannot be changed
# Use it for values that must never change while the program runs
t1=(1,2,3,4,5) # () tuple, [] list
t2=('a','b','c')
t3=(1,'a','abc',[1,2,3,4,5],['a','b','c'])
def myfunc():
    print('Hello')
t4=(21,2,myfunc)
#t1[0]=6   # would raise a TypeError because tuples do not support item assignment
#print(t1)
print(t4)
t4[2]()

(21, 2, <function myfunc at 0x0000015AE20EC678>)

Hello

t1 = (1,) # a tuple with a single element needs a comma after the element
print(t1)
t1 = (1,2,3,4,5)
t2 = ('a','b','c')
t3 = (1,'a','abc', [1,2,3,4,5],['a','b','c'])
print(t1)
print(t2)
print(t3)

(1,)

(1, 2, 3, 4, 5)

('a', 'b', 'c')

(1, 'a', 'abc', [1, 2, 3, 4, 5], ['a', 'b', 'c'])

a=(1,2,3)
b=('a','b','c')
c=a+b
print(c)
print(c*3)

(1, 2, 3, 'a', 'b', 'c')

(1, 2, 3, 'a', 'b', 'c', 1, 2, 3, 'a', 'b', 'c', 1, 2, 3, 'a', 'b', 'c')

# Dictionaries
# A dictionary is a collection whose elements are key-value pairs, accessed by key rather than by position
# It is not a sequence type, so values cannot be accessed by positional index
dic1={'a':1, 'b':2,'c':3}
print(dic1['a'])
print(dic1['c'])
print(dic1)
print(len(dic1)) # number of key-value pairs

1

3

{'a': 1, 'b': 2, 'c': 3}

3

# Dictionary methods; a dictionary is written with {}
a={'name':'pey','phone':'0123456789','birth':'1108'}
print(a.keys())
print(a.values())
print(a.items()) # get the key:value pairs
print(a.get('name'))
print(a.get('nokey')) # there is no value for nokey, so None is returned

dict_keys(['name', 'phone', 'birth'])

dict_values(['pey', '0123456789', '1108'])

dict_items([('name', 'pey'), ('phone', '0123456789'), ('birth', '1108')])

pey

None

# Check whether a key is in the dictionary (in)
a={'name':'pey','phone':'0123456789','birth':'1108'}
print('name' in a)
print('email' in a)

True

False

# To get a preset default value when the key is not in the dictionary, use get(x, 'default value')
a={'name':'pey','phone':'0123456789','birth':'1108'}
print(a.get('foo','bar')) # foo does not exist, so the default value bar is returned
print(a.clear()) # clear() empties the dictionary and returns None
print(a.get('name','bar')) # a was just cleared, so name no longer exists and the default bar is returned

bar

None

bar

# Sets
# A set has no order, so values cannot be accessed by index
s1=set([1,2,3,4,5,6])
s2=set([4,5,6,7,8,9])
print(s1&s2) # & means intersection
print(s1.intersection(s2)) # another way to get the intersection
print(s1|s2) # | means union
print(s1.union(s2)) # another way to get the union

{4, 5, 6}

{4, 5, 6}

{1, 2, 3, 4, 5, 6, 7, 8, 9}

{1, 2, 3, 4, 5, 6, 7, 8, 9}

# Printing as lists
s3=list(s1&s2) # intersection
print(s3)
s4=list(s1|s2) # union
print(s4)

[4, 5, 6]

[1, 2, 3, 4, 5, 6, 7, 8, 9]

# Difference
s1=set([1,2,3,4,5,6])
s2=set([4,5,6,7,8,9])
print(s1-s2)
print(s2-s1)

{1, 2, 3}

{8, 9, 7}

s1.add(7) # add one value
print(s1)
s1.update([9,10]) # add several values
print(s1)
s1.remove(10)
print(s1)

{1, 2, 3, 4, 5, 6, 7}

{1, 2, 3, 4, 5, 6, 7, 9, 10}

{1, 2, 3, 4, 5, 6, 7, 9}

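# Escape characters
# Characters that start with a backslash '\' and represent symbols that are hard to type on a keyboard
print('I love Python. Python is easier than Java.')
print('I love Python.\nPython is easier than Java.') # \n: newline
print('Name:Jone \tSex:Male \tAge:22') # \t: tab
# A backslash at the very end of a source line continues the string on the next line without adding a newline
print('This sentence is too long for the screen, so the source moves to the next line. \
But it is printed on a single line.')

I love Python. Python is easier than Java.

I love Python.

Python is easier than Java.

Name:Jone 	Sex:Male 	Age:22

This sentence is too long for the screen, so the source moves to the next line. But it is printed on a single line.

# Indentation
listdata=['a','b','c']
if 'a' in listdata:
    print('a is in listdata.')
    print(listdata)
else:
    print('a is not in listdata.')

a is in listdata.

['a', 'b', 'c']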

# String formatting
# Substituting a number
a="I eat %d apples."%3 # put 3 in place of %d
print(a)
# Substituting a string
b='I eat %s apples.' %'five'
print(b)
number=3
c='I eat %d apples.' %number
print(c)
# Substituting two or more strings or numbers
number=10
day='three'
d='I eat %d apples. so I was sick of %s days' %(number, day)
print(d)

I eat 3 apples.

I eat five apples.

I eat 3 apples.

I eat 10 apples. so I was sick of three days

# The %s format code can take a value of any type
# %s automatically converts the value after % to a string, so even a number is printed as text
e='I have %s apples' %3
print(e)
f='rate is %s' %3.234
print(f)

I have 3 apples

rate is 3.234

# To print a literal % inside a format string, write %%
g='Error is %d%%.' %98
print(g)

Error is 98%.

# Alignment and padding
a='%10s' %'hi' # right-aligned in a field 10 characters wide
print(a)
b='%-10s' %'hi' # left-aligned in a field 10 characters wide
print(b)

        hi

hi        

# Decimal places
c='%0.2f' %3.4567892
print(c)
d='%10.4f' %3.4567892
print(d)
e='%10.2f' %3.4567892
print(e)

3.46

    3.4568

      3.46

# String methods
a='hobby'
print(a.count('b')) # count how many times the letter b appears
a='Python is best choice'
print(a.find('b')) # find the index of b
print(a.find('k')) # returns -1 when k is not found
print(a.index('n'))
print(a.index('k')) # raises an error when the character is not found

2

10

-1

5

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-130-6999e625ec54> in <module>
      6 print(a.find('k')) # returns -1 when k is not found
      7 print(a.index('n'))
----> 8 print(a.index('k')) # raises an error when the character is not found
ValueError: substring not found

# Inserting a separator between characters (join)
a=','
a=a.join('abcd')
print(a)

a,b,c,d

# Lowercase to uppercase
a='hi'
a=a.upper()
print(a)
# Uppercase to lowercase
b='HI'
b=b.lower()
print(b)

HI

hi

# Remove whitespace on the left (lstrip)
a=' hi'
print(a)
b=a.lstrip()
print(b)
# Remove whitespace on the right (rstrip)
a='hi '
print(a)
b=a.rstrip()
print(b)
# Remove whitespace on both sides (strip)
a=' hi '
print(a)
c=a.strip()
print(c)

# Replacing part of a string
a='Life is too short'
print(a.replace('Life','My leg'))
# Splitting a string
a='Life is too short'
print(a.split())
b = 'a:b:c:d'
print(b.split(':'))

My leg is too short

['Life', 'is', 'too', 'short']

['a', 'b', 'c', 'd']

# The format method (advanced formatting)
a = 'I eat {0} apples'.format(3) # 0 means the first argument of format(); pass the value to substitute
print(a)
f = '{0:<10}'.format('hello') # left-aligned, width 10
print(f)
z = '{0:>10}'.format('hello') # right-aligned, width 10
print(z)
i='{0:=^10}'.format('hi') # ^ centers the value; the padding is filled with '='
print(i)
k = '{0:0.4f}'.format(3.567892) # rounded to 4 decimal places
print(k)

I eat 3 apples

hello     

     hello

====hi====

3.5679

# 'I eat five apples' -> 'eight'
a='I eat %s apples' %'five'
print(a)
a='I eat {0} apples' .format('eight')
print(a)

I eat five apples

I eat eight apples

# 'I ate 10 apples. so I was sick for three days' -> 5, 'five'
number=5
day='five'
a='I ate {0} apples. so I was sick for {1} days' .format(number,day)
#c='I ate {} apples. so I was sick for {} days'.format(5,'five')
#print(c)   empty braces also work
print(a)

I ate 5 apples. so I was sick for five days

# Doubled braces {{ }} print literal braces
m='{{and}}' .format()
print(m)

{and}

a='{0:>10}' .format('hello') # right-aligned
b='{0:<10}' .format('hello') # left-aligned
c='{0:^10}' .format('hello') # centered
print(a)
print(b)
print(c)

     hello

hello     

  hello   

# y=3.451235, width 10
y=3.451235
k = '{0:<0.4f}'.format(y)
print(k)
j = '{0:>10.4f}'.format(y)
print(j)

3.4512

    3.4512

# 1. Kim Cheolsu's resident registration number is 881230-1078523.
# Split it into the birth date part and the number part after it, and print each
a='881230-1078523'
for part in a.split('-'):
    print(part)

881230

1078523

# 2. Change the list a = [1,3,5,4,2] to [5,4,3,2,1] and print it
a=[1,3,5,4,2]
a.sort(reverse=True)
print(a)

[5, 4, 3, 2, 1]

# 3. Print the list ['Life','is','too','short'] as the string
# Life is too short
a=['Life', 'is', 'too', 'short']
print(a[0]+' '+a[1]+' '+a[2]+' '+a[3]) # ' '.join(a) gives the same result

Life is too short

# 4. Convert the string to a list and print ['Life','is','too','short']
a='Life is too short'
print(a.split(' '))

['Life', 'is', 'too', 'short']

# 5. Add the value 4 to the list [1,2,3] and print it
a1=[1,2,3]
a1.append(4)
print(a1)

[1, 2, 3, 4]

# 6. Remove the duplicate numbers from list a and print the result as a list
a=[1,1,1,2,2,3,3,3,4,4,5]
a=sorted(set(a)) # set() removes duplicates; sorted() returns a list
print(a)

[1, 2, 3, 4, 5]

# 7. Print 80 and 70 from a
a={'A':90,'B':80,'C':70}
print(a['B'])
print(a['C'])

80

70

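# Copying a list
# id() is a built-in function that returns the identity (address) of the object a variable refers to
a=[1,2,3]
b=a # b refers to the same list object as a
a[1]=4
print(a)
print(b)
print(id(a)) # identity (memory address)
print(id(b))

[1, 4, 3]

[1, 4, 3]

1489876716616

1489876716616

a=[1,2,3]
b=a[:] # slicing the whole list creates a new list object
a[1]=4
print(a)
print(b)
print(id(a))
print(id(b))

[1, 4, 3]

[1, 2, 3]

1489876863688

1489876214472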
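
One caveat goes slightly beyond the notes above: a[:] copies only the outer list. If the list contains other lists, the inner lists are still shared. The sketch below contrasts the slice copy with copy.deepcopy from the standard library; the variable names shallow and deep are chosen for this illustration.

```python
import copy

a = [1, [2, 3]]
shallow = a[:]             # new outer list, but the inner list is shared with a
deep = copy.deepcopy(a)    # fully independent copy, inner list included

a[1][0] = 99               # change the inner list through a

print(shallow)   # [1, [99, 3]]  the change is visible through the shared inner list
print(deep)      # [1, [2, 3]]   the deep copy is unaffected
```

For flat lists of numbers or strings, such as the [1, 2, 3] example above, the slice copy is sufficient.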

# Control statements
# if-else
x=1
y=2
if x >=y:
    print('x is greater than or equal to y.')
else:
    print('x is less than y.')

x is less than y.

# If you have 3000 won or more, take a taxi; otherwise walk
money=2000
if money >=3000:
    print('Take a taxi')
else:
    print('Walk')

Walk

# If you have 3000 won or more, or you have a card, take a taxi; otherwise walk
money=2000
card=1
if money>=3000 or card:
    print('Taxi')
else:
    print('Walk')

Taxi

# if~elif
x=1
y=2
if x>y:
    print('x is greater than y.')
elif x<y:
    print('x is less than y.')
else:
    print('x and y are equal.')

x is less than y.

In [185]:

#for문

scope=[1,2,3,4,5]

for x in scope:

print(x, end=' ')

1 2 3 4 5

In [187]:

# for with continue and break
scope=[1,2,3,4,5]
for x in scope:
    print(x)
    if x <3:
        continue
    else:
        break

1
2
3

In [196]:

# range(1,11): the end value is not included
total=0   # named total so the built-in sum() is not shadowed
for i in range(1,11):
    total+=i # same as total=total+i; -=, *= and /= work the same way
print(total)

In [197]:

marks = [90, 25, 67, 45, 80]
for number in range(len(marks)):
    if marks[number] < 60: continue
    print('Student %d, congratulations.' %(number+1))

Student 1, congratulations.
Student 3, congratulations.
Student 5, congratulations.
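
The same loop can be written without manual indexing. The sketch below uses `enumerate` with `start=1` so the student number comes straight from the loop; the list `passed` is an illustrative addition that collects the numbers instead of printing a message for each.

```python
marks = [90, 25, 67, 45, 80]
passed = []
for number, mark in enumerate(marks, start=1):   # number counts from 1
    if mark < 60:
        continue            # skip students below the pass mark
    passed.append(number)
print(passed)
```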

In [201]:

for i in range(2,10):
    for j in range(1,10):
        print(i*j, end=' ')
    print('')

2 4 6 8 10 12 14 16 18
3 6 9 12 15 18 21 24 27
4 8 12 16 20 24 28 32 36
5 10 15 20 25 30 35 40 45
6 12 18 24 30 36 42 48 54
7 14 21 28 35 42 49 56 63
8 16 24 32 40 48 56 64 72
9 18 27 36 45 54 63 72 81
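
As an alternative to printing inside the inner loop, each row of the multiplication table can be built as a string first. This is only a sketch of that variation; `rows` is a name introduced here.

```python
rows = []
for i in range(2, 10):
    # one row per multiplier: i*1 ... i*9, separated by single spaces
    rows.append(' '.join(str(i * j) for j in range(1, 10)))
print('\n'.join(rows))
```

Building the rows first avoids the trailing space that `end=' '` leaves at the end of every line.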

In [216]:

scope=[1,2,3,4,5]
for i in scope:
    if i<4:
        print(i)
print('perfect')

scope = [1,2,3,4,5]
for i in scope:
    if i<4:
        print(i)
    elif i>4:
        print('perfect')

perfect

In [227]:

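# Q. Adding the integers from 1 to n, find the n at which the sum first exceeds 100,000, and that sum.
n=1
total=0   # named total so the built-in sum() is not shadowed
while 1: # while 1 means the condition is always True
    total = total+n # first pass: 0 + 1
    if total > 100000: # first pass: 1 is not greater than 100000
        print(n)
        print(total)
        break
    n=n+1
    # n becomes 1 + 1 = 2 and the loop starts over;
    # once the sum exceeds 100000, the matching n is printed and the loop breaks.

447
100128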
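
The loop's answer can be cross-checked with the closed form 1 + 2 + ... + n = n(n+1)/2. Here is a sketch of that check; `triangular` is a helper name chosen here, not something from the notebook.

```python
def triangular(n):
    # sum of the integers 1..n
    return n * (n + 1) // 2

n = 1
while triangular(n) <= 100000:   # advance until the sum first exceeds 100000
    n += 1
print(n, triangular(n))
```

For n = 446 the formula gives 99681, which is still below the limit, so 447 is the first n that crosses it.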

In [1]:

# Use a while loop to sum the multiples of 3 among the natural numbers from 1 to 1000
n=0
b=0
while 1:
    if n > 1000:
        print(b)
        break
    b = b+n
    n = n+3

n=0
b=0
while 1:
    if n==1000:
        print(b)
        break
    elif n%3==0:
        b = b+n
    n=n+1

166833
166833
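
Both `while` versions can be checked against a one-line form, since `range` accepts a step. A minimal sketch; `total` and `check` are names introduced here.

```python
total = sum(range(3, 1001, 3))   # 3, 6, ..., 999
check = sum(n for n in range(1, 1001) if n % 3 == 0)   # filter instead of step
print(total, check)
```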

Attribution, NonCommercial, NoDerivatives


20191231 if statements, loops (for), methods

GOGO치삼 2020. 3. 18. 21:31

1. if statements

package Control;

public class MainClass {
	public static void main (String [] args) {
		// Good code is code that others find easy to read, even if it is longer. Describe what it does in comments.

		String[] code={"JP", "FR", "JP", "US", "CN", "DE", "KR", "JP", "DE","KR"};
		System.out.println(code[0]);
		// Pick a slot (index) and read the value stored there.

		System.out.println(code.length);
		// length gives the total number of elements in the array; here it is 10.

		// We want the number of employees working (code) in Japan ("JP"): add 1 each time one is found.
		// The int counter must be initialized to 0; declaring it with only a semicolon is not enough.
		int count = 0;
		count = count +1;

		// Using if statements
		count = 0; // reset the counter; declaring int count a second time would not compile

		// To test whether two strings are equal, use equals (a built-in "method") rather than ==.
		// It checks whether the given slot of code holds "JP" and returns true or false.

		if(code[0].equals("JP")){ //true
		count=count+1; // count the match
		}

		if(code[1].equals("JP")){ //false, so the body is not executed
		count=count+1;
		}

		if(code[2].equals("JP")){ //true
		count=count+1;
		}

		if(code[3].equals("JP")){ //false
		count=count+1;
		}

		if(code[4].equals("JP")){ //false
		count=count+1;
		}

		if(code[5].equals("JP")){ //false
		count=count+1;
		}

		if(code[6].equals("JP")){ //false
		count=count+1;
		}

		if(code[7].equals("JP")){ //true
		count=count+1;
		}

		if(code[8].equals("JP")){ //false
		count=count+1;
		}

		if(code[9].equals("JP")){ //false
		count=count+1;
		}

		System.out.println(count);
        
		// count=count+1; can also be written as count++;
		if(code[9].equals("JP")){ //false
		count++; // count=count++; would leave count unchanged
		}

	}//main end
}

2. Loops (for)

package Control;

public class MainClass {
	public static void main (String [] args) {
		// Learning loops (the for statement)

		String[] code={"JP", "FR", "JP", "US", "CN", "DE", "KR", "JP", "DE","KR"};
		int[] age={27,34,28,26,41,28,42,29,29,32};

		/* No. 1: sum of all employees' ages
		int sum = 0; // accumulator for the running total

		// replaces sum=sum+age[0]; ... sum=sum+age[9];
		for(int a = 0; a < age.length; a++){ // loops over the age array, so use age.length
			sum = sum+age[a];
		}

		System.out.println("Sum of employee ages: "+sum);

		*/

		// No. 2: sum of the ages of employees working in Korea ("KR")

		int sum = 0; // accumulator variable
		for(int m = 0; m < code.length; m++){ // repeats 10 times
			if(code[m].equals("KR")){ // the search uses if, not for: if the code is KR
				sum= sum+age[m]; // add that employee's age (the value in slot m of age)
				}
			}

		System.out.println(sum);

		}
}

3. Nested for loops (multiplication table example)


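4. Methods

4-1. BookClass.java

package Method;

// To rename the file, select its icon and press F2; every reference to it is updated as well.

public class BookClass {

	String author = ""; // author
	int price = 0; // list price
	int r_price=0; // selling price

		public BookClass(){ // default constructor
			}

		public void realPrice(int d){ // method that computes the selling price

			r_price = price-d; // list price minus the discount d, which varies per call
		}
}

// Use new to create a copy (an instance) of the class.

4-2. MainClass.java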

package Method;
public class MainClass {

public static void main(String[] args) {

	// What holds a copy (instance) is called an object; broadly speaking, it is a variable.

	BookClass b1 = new BookClass(); // copy (instance)

	b1.author = "Author"; // fill in the fields
	b1.price = 200; // list price
	b1.realPrice(120); // apply a discount of 120 to get the actual selling price

	System.out.println("Author's actual selling price: "+ b1.r_price); // prints the actual selling price

	BookClass b2 = new BookClass(); // second copy

	b2.author = "Hong Gil-dong";
	b2.price = 150;
	b2.realPrice(100);

	System.out.println("Hong Gil-dong's actual selling price: "+ b2.r_price);

	BookClass[] books= new BookClass[2]; // array with 2 slots: allocate the space first, then fill it (takes three lines)

	books[0]=b1;
	books[1]=b2;

	System.out.println(books[0].author);
	System.out.println(books[1].author);

	BookClass[] books2={b1,b2}; // create and fill the array at once; a single line, so more economical

	System.out.println(books2[0].author);
	System.out.println(books2[1].author);
	System.out.println(books[0].author+"||"+books2[0].author);

	}
}

Attribution, NonCommercial, NoDerivatives


20191230 Getting started with JAVA

GOGO치삼 2020. 3. 18. 21:04

1. Printing to the screen

package com;

public class TestClass1 {

	public static void main (String[]args){
		
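		// Run shortcut (MacBook): Shift+Command+fn+F11
		// Run shortcut (Windows): Ctrl+F11
		
		// MacBook: type Sysout, then Ctrl+Command+Space
		// Windows: type syso, then Ctrl+Space
		
		// println ends the line after printing; print does not
		System.out.println("Java class begins");
		
		System.out.print("Java class begins");
		System.out.print("Java class begins");
	}

}

Java class begins
Java class beginsJava class begins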

2. Classes, variable types, basic arrays

package com;

public class TestClass2 {
	public static void main (String[]args){

	/* 1. A file contains one or more classes.
	 * 2. To run, a file must contain a main method.
	 * A file may hold more than one class, but there must be exactly one main. */
		int number;
		String text;
		boolean bool;
		
		number =100;
		text = number+"abc";
		System.out.println(text);
		System.out.println("text");
		
		// create a new array with room for 5 ints
		int [] nums = new int[5];
		nums[2]=3;
		nums[0]=1;
		nums[3]=number; //100
		
		System.out.println(nums[2]);//3
		System.out.println(nums[3]);//100
		
		// create a new array with room for 5 Strings; new is required
		String[] strs = new String[5];
		strs[0]="10";
		
		// a variable must be given a value before it is used
		int init = 0;
		int sum = init+100;
		System.out.println(sum);//100
		
	
	}
}

100abc
text
3
100
100

Attribution, NonCommercial, NoDerivatives


[Hangul 2010] :: Hangul 2010 keyboard shortcuts

GOGO치삼 2017. 10. 13. 00:09

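Action	Shortcut
Switch document tabs	Ctrl+Tab / Ctrl+Shift+Tab
Save	Alt+S
Open document	Alt+O
View document information	Ctrl+Q, I
Select a block	F3, End / arrow keys while holding Shift; F3 (press twice to select a word, three times to select a paragraph)
Select all	Ctrl+A
Force page break	Ctrl+Enter
Go to start/end of document	Ctrl+PgUp/PgDn
Cut	Ctrl+X
Character shape	Alt+L
Paragraph shape	Alt+T
Page setup	F7
Print	Alt+P
Character map	Ctrl+F10
Hanja input	F9
Insert table	Ctrl+N,T
Block average	Ctrl+Shift+A
Formula	Ctrl+N+F
Auto fill	A
Page numbering	Ctrl+N,P
Footnote	Ctrl+N,N
Resize picture	Shift+arrow keys while the picture is selected
Insert picture	Ctrl+N,I
Edit picture	Ctrl+N,K
Spell check	F8
Spacing	Space Bar/Enter/Tab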

Attribution, NonCommercial, NoDerivatives


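A constructor is a method that is called automatically when an object is created.

In Python, a method named __init__ becomes the constructor.

The __init__ method is identical to the setdata method except for its name; because it is named __init__, it is recognized as the constructor and is called automatically at the moment the object is created.

Like any other method, __init__ automatically receives the object being created as its first parameter, self.

When __init__ is called, the instance variables first and second are created, just as they are when setdata is called.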
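
The description above can be sketched as a small class. The names `setdata`, `first` and `second` come from the text; the class name `FourCal` and the `add` method are assumptions made only for this illustration.

```python
class FourCal:   # hypothetical class name, not given in the text
    def __init__(self, first, second):
        # runs automatically when FourCal(...) is called; self is the new object
        self.first = first
        self.second = second

    def setdata(self, first, second):
        # same body as __init__, but it has to be called explicitly
        self.first = first
        self.second = second

    def add(self):   # illustrative method that uses both instance variables
        return self.first + self.second

a = FourCal(4, 2)    # __init__ creates first and second here
print(a.first, a.second, a.add())
a.setdata(10, 5)     # an explicit call replaces both values
print(a.add())
```

With `__init__` defined, calling `FourCal()` without arguments raises a `TypeError`, so an object can never exist without `first` and `second`.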


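Data preprocessing

Handling missing data

Q. Count the NaN values in the deck column.

Q. Find the null values in the first 5 rows of titanic_df and print them (True/False).

Q. Count the nulls in the 'deck' column of titanic_df.

Q. Use a for loop to count the nulls in each column of titanic_df, then print the counts.

Q. Replace the NaN value in the embark_town column with the value from row 828 just before it, then print the result.

Handling duplicate data

Q. Remove the duplicate rows from df, store the result in df2, and print it.

Q. Remove the duplicate rows from df based on columns c2 and c3, store the result in df3, and print it.

Changing data units

Q. Convert 'mpg' to 'kpl' in a new column and print the first 3 rows, rounded at the second decimal place.

Q. Print the unique values of the horsepower column.

Q. Remove the missing-data marker '?' from the horsepower column, then print the number of NaN values.

Q. Convert the horsepower strings to floats, then check the data type.

Q. Handle the following:

Check the data type of the origin column, convert it to categorical, and print it.

Q. Convert the origin column from categorical to string, then print the data type.

Q. Convert the model year column from integer to categorical, then print it.

Handling categorical data

Dummy variables

Normalization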


Introduction to Python

Features of Python

What you can do with Python

What you cannot do with Python
