Compare commits

...

13 Commits

Author SHA1 Message Date
7a823377f4 .. 2025-06-02 08:42:19 +00:00
1ad58eaa81 bot deny, add header 2025-05-11 00:51:05 +09:00
084f46edd9 Implement Naver crawling && draft post registration via the WP API 2025-01-19 23:58:30 +09:00
da51f38e3c Remove unnecessary files 2024-10-04 01:36:45 +09:00
6b71884047 Phase 1 complete 2024-10-04 01:04:32 +09:00
82b132f519 docker test setting 2024-10-04 00:28:36 +09:00
7f03ff861b Demo code complete 2024-10-04 00:02:07 +09:00
019a1419d1 Refactor environment variable loading and DB lookup 2024-10-03 21:26:06 +09:00
4d1fd7a5be Code refactoring 2024-10-03 08:32:03 +09:00
2b0d3ebb6c Fix git warning 2024-10-02 23:53:12 +09:00
34bf998113 Add sample env file 2024-10-02 23:51:49 +09:00
6ff1eb043e HTML conversion and WP API push tests complete 2024-10-02 23:50:09 +09:00
39cf244de9 Update development order 2024-10-02 08:44:10 +09:00
32 changed files with 1067 additions and 139 deletions

15
Dockerfile Normal file

@ -0,0 +1,15 @@
FROM python:3.10-slim-buster
WORKDIR /usr/src/app
ENV PYTHONDONTWRITEBYTECODE 1
ENV PYTHONUNBUFFERED 1
COPY . /usr/src/app
# install python dependencies
RUN pip install --upgrade pip
RUN pip install --no-cache-dir -r requirements.txt
# Run main.py with Python.
CMD ["python", "main.py"]

View File

@ -1,7 +1,24 @@
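# wp-post-automation
WordPress posting automation project.
Converts an automation flow built with make.com into Python.
## Existing Flow
## Purpose
* Implement WordPress posting automation.
## Key Points
### Project Overview
2025.01.19 - Implemented Naver blog scraping and WordPress draft registration.
* Initial feature implementation complete.
2024.10.04 - Testing complete. Phase 1 of the project finished.
* WordPress posting automation project.
* Converts an automation flow built with make.com into Python.
### Usage
* Copy sample.env.dev into the root directory as one of .env.prd | .env.dev | .env.
* main.py is the default automation process.
* OpenAI rewrites the crawled post, which is then registered as Markdown.
* main_naver_blog_html is the process dedicated to Naver blogs.
* The crawled post is registered as Markdown without modification.
---
## Reference Workflow
* Fetch the latest reference URL stored in MariaDB.
* Fetch the reference material with the HTTP module.
* Extract only the text from the fetched HTML.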
@ -13,23 +30,48 @@ Converts an automation flow built with make.com into Python
* Post to WordPress.
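Review note on the workflow above: the "extract only the text from the fetched HTML" step can be sketched with the standard library alone. This is a minimal sketch, not the project's implementation (the project uses BeautifulSoup); `TextExtractor` and `extract_text` are hypothetical names introduced here, and the choice to skip `<script>` and `<style>` bodies is an assumption of this sketch.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collects visible text and skips <script>/<style> bodies (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep non-empty text that is not inside a skipped element.
        if not self._skip_depth and data.strip():
            self.chunks.append(data.strip())


def extract_text(html):
    """Return the text of an HTML document, one chunk per line."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return "\n".join(parser.chunks)


sample = "<html><body><h1>Title</h1><script>var x = 1;</script><p>Body text</p></body></html>"
print(extract_text(sample))
```

Joining the chunks with newlines mirrors the blank-line filtering the project applies after `get_text()`.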
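## Development Plan
### 2025.01.19
* Crawl Naver blog posts.
* Register them as WordPress drafts via the API.
### 2024.10.04
* Reimplement the existing flow in Python.
* Build it as a container that runs when a trigger fires.
* Implement it as a one-off run using kubectl create -f file.yaml.
### Python Development Order
* Write the code that fetches the URL from the DB.
* Implement parsing of the URL and text-only extraction.
* Write the OpenAI code.
* Write the HTML conversion code.
* Write the WordPress registration flow.
* Refactor the code.
## Python Development Order
### 2025.01.19
* There is currently no DB integration; a URL must be entered at run time (done).
* Implement parsing of the URL and text-only extraction (done).
* Extract the content as Markdown (done).
* Write the HTML conversion code (done).
* Register a draft post through the WordPress API (done).
### 2024.10.04
* Write the code that fetches the URL from the DB (done).
* Implement parsing of the URL and text-only extraction (done).
* Write the OpenAI code (done); title and image generation are excluded to reduce cost.
* Write the HTML conversion code (done).
* Write the WordPress registration flow (done).
## Code Refactoring
### 2024.10.04
* Full refactoring (done).
* Modularization and packaging (done).
## Docker Image Build
### 2024.10.04 Update
* Add Dockerfile (done).
## Kubernetes manifests
### 2024.10.04 Update
* Write sample templates (done).
* Test in a Kubernetes environment (done).
---
## Code Issues
### Naver Blog Crawling
As of 2024.10.02
* Ordinary news articles currently appear to work well.
* Naver blogs presumably cannot be crawled because of a JavaScript issue; Selenium needs to be evaluated.
* Nothing notable.
---
## License
### Items for License Review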

View File

@ -0,0 +1,37 @@
import requests
from bs4 import BeautifulSoup


def get_naver_blog_content(url):
    # Switch to the mobile version of the Naver blog
    mobile_url = url.replace("blog.naver.com", "m.blog.naver.com")
    headers = {
        "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/91.0.4472.124 Safari/537.36"
    }
    # HTTP request
    response = requests.get(mobile_url, headers=headers)
    if response.status_code != 200:
        print(f"Failed to fetch the page: {response.status_code}")
        return None
    # Parse the HTML with BeautifulSoup
    soup = BeautifulSoup(response.text, 'html.parser')
    # Extract the body (using the body class of the mobile version)
    content = soup.find("div", class_="se-main-container")
    if content:
        return content.get_text(strip=True)
    else:
        print("Failed to extract the blog content.")
        return None


# Example URL
url = "https://blog.naver.com/kte1909/223724132196"
blog_content = get_naver_blog_content(url)
if blog_content:
    print("Blog Content:")
    print(blog_content)
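Review note: the `url.replace("blog.naver.com", "m.blog.naver.com")` call above also matches a URL that is already on the mobile host and would turn it into `m.m.blog.naver.com`. A minimal sketch of a host-based rewrite follows; `to_mobile_url` is a hypothetical helper name, not something defined in this change.

```python
from urllib.parse import urlsplit, urlunsplit


def to_mobile_url(url):
    """Return the m.blog.naver.com form of a Naver blog URL (hypothetical helper).

    Only an exact desktop host is rewritten, so calling it twice is harmless.
    """
    parts = urlsplit(url)
    if parts.netloc == "blog.naver.com":
        parts = parts._replace(netloc="m.blog.naver.com")
    return urlunsplit(parts)


print(to_mobile_url("https://blog.naver.com/kte1909/223724132196"))
print(to_mobile_url("https://m.blog.naver.com/kte1909/223724132196"))
```

Comparing the parsed host instead of a substring also leaves unrelated URLs untouched.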

247
bin/Activate.ps1 Normal file

@ -0,0 +1,247 @@
<#
.Synopsis
Activate a Python virtual environment for the current PowerShell session.
.Description
Pushes the python executable for a virtual environment to the front of the
$Env:PATH environment variable and sets the prompt to signify that you are
in a Python virtual environment. Makes use of the command line switches as
well as the `pyvenv.cfg` file values present in the virtual environment.
.Parameter VenvDir
Path to the directory that contains the virtual environment to activate. The
default value for this is the parent of the directory that the Activate.ps1
script is located within.
.Parameter Prompt
The prompt prefix to display when this virtual environment is activated. By
default, this prompt is the name of the virtual environment folder (VenvDir)
surrounded by parentheses and followed by a single space (ie. '(.venv) ').
.Example
Activate.ps1
Activates the Python virtual environment that contains the Activate.ps1 script.
.Example
Activate.ps1 -Verbose
Activates the Python virtual environment that contains the Activate.ps1 script,
and shows extra information about the activation as it executes.
.Example
Activate.ps1 -VenvDir C:\Users\MyUser\Common\.venv
Activates the Python virtual environment located in the specified location.
.Example
Activate.ps1 -Prompt "MyPython"
Activates the Python virtual environment that contains the Activate.ps1 script,
and prefixes the current prompt with the specified string (surrounded in
parentheses) while the virtual environment is active.
.Notes
On Windows, it may be required to enable this Activate.ps1 script by setting the
execution policy for the user. You can do this by issuing the following PowerShell
command:
PS C:\> Set-ExecutionPolicy -ExecutionPolicy RemoteSigned -Scope CurrentUser
For more information on Execution Policies:
https://go.microsoft.com/fwlink/?LinkID=135170
#>
Param(
[Parameter(Mandatory = $false)]
[String]
$VenvDir,
[Parameter(Mandatory = $false)]
[String]
$Prompt
)
<# Function declarations --------------------------------------------------- #>
<#
.Synopsis
Remove all shell session elements added by the Activate script, including the
addition of the virtual environment's Python executable from the beginning of
the PATH variable.
.Parameter NonDestructive
If present, do not remove this function from the global namespace for the
session.
#>
function global:deactivate ([switch]$NonDestructive) {
# Revert to original values
# The prior prompt:
if (Test-Path -Path Function:_OLD_VIRTUAL_PROMPT) {
Copy-Item -Path Function:_OLD_VIRTUAL_PROMPT -Destination Function:prompt
Remove-Item -Path Function:_OLD_VIRTUAL_PROMPT
}
# The prior PYTHONHOME:
if (Test-Path -Path Env:_OLD_VIRTUAL_PYTHONHOME) {
Copy-Item -Path Env:_OLD_VIRTUAL_PYTHONHOME -Destination Env:PYTHONHOME
Remove-Item -Path Env:_OLD_VIRTUAL_PYTHONHOME
}
# The prior PATH:
if (Test-Path -Path Env:_OLD_VIRTUAL_PATH) {
Copy-Item -Path Env:_OLD_VIRTUAL_PATH -Destination Env:PATH
Remove-Item -Path Env:_OLD_VIRTUAL_PATH
}
# Just remove the VIRTUAL_ENV altogether:
if (Test-Path -Path Env:VIRTUAL_ENV) {
Remove-Item -Path env:VIRTUAL_ENV
}
# Just remove VIRTUAL_ENV_PROMPT altogether.
if (Test-Path -Path Env:VIRTUAL_ENV_PROMPT) {
Remove-Item -Path env:VIRTUAL_ENV_PROMPT
}
# Just remove the _PYTHON_VENV_PROMPT_PREFIX altogether:
if (Get-Variable -Name "_PYTHON_VENV_PROMPT_PREFIX" -ErrorAction SilentlyContinue) {
Remove-Variable -Name _PYTHON_VENV_PROMPT_PREFIX -Scope Global -Force
}
# Leave deactivate function in the global namespace if requested:
if (-not $NonDestructive) {
Remove-Item -Path function:deactivate
}
}
<#
.Description
Get-PyVenvConfig parses the values from the pyvenv.cfg file located in the
given folder, and returns them in a map.
For each line in the pyvenv.cfg file, if that line can be parsed into exactly
two strings separated by `=` (with any amount of whitespace surrounding the =)
then it is considered a `key = value` line. The left hand string is the key,
the right hand is the value.
If the value starts with a `'` or a `"` then the first and last character is
stripped from the value before being captured.
.Parameter ConfigDir
Path to the directory that contains the `pyvenv.cfg` file.
#>
function Get-PyVenvConfig(
[String]
$ConfigDir
) {
Write-Verbose "Given ConfigDir=$ConfigDir, obtain values in pyvenv.cfg"
# Ensure the file exists, and issue a warning if it doesn't (but still allow the function to continue).
$pyvenvConfigPath = Join-Path -Resolve -Path $ConfigDir -ChildPath 'pyvenv.cfg' -ErrorAction Continue
# An empty map will be returned if no config file is found.
$pyvenvConfig = @{ }
if ($pyvenvConfigPath) {
Write-Verbose "File exists, parse `key = value` lines"
$pyvenvConfigContent = Get-Content -Path $pyvenvConfigPath
$pyvenvConfigContent | ForEach-Object {
$keyval = $PSItem -split "\s*=\s*", 2
if ($keyval[0] -and $keyval[1]) {
$val = $keyval[1]
# Remove extraneous quotations around a string value.
if ("'""".Contains($val.Substring(0, 1))) {
$val = $val.Substring(1, $val.Length - 2)
}
$pyvenvConfig[$keyval[0]] = $val
Write-Verbose "Adding Key: '$($keyval[0])'='$val'"
}
}
}
return $pyvenvConfig
}
<# Begin Activate script --------------------------------------------------- #>
# Determine the containing directory of this script
$VenvExecPath = Split-Path -Parent $MyInvocation.MyCommand.Definition
$VenvExecDir = Get-Item -Path $VenvExecPath
Write-Verbose "Activation script is located in path: '$VenvExecPath'"
Write-Verbose "VenvExecDir Fullname: '$($VenvExecDir.FullName)"
Write-Verbose "VenvExecDir Name: '$($VenvExecDir.Name)"
# Set values required in priority: CmdLine, ConfigFile, Default
# First, get the location of the virtual environment, it might not be
# VenvExecDir if specified on the command line.
if ($VenvDir) {
Write-Verbose "VenvDir given as parameter, using '$VenvDir' to determine values"
}
else {
Write-Verbose "VenvDir not given as a parameter, using parent directory name as VenvDir."
$VenvDir = $VenvExecDir.Parent.FullName.TrimEnd("\\/")
Write-Verbose "VenvDir=$VenvDir"
}
# Next, read the `pyvenv.cfg` file to determine any required value such
# as `prompt`.
$pyvenvCfg = Get-PyVenvConfig -ConfigDir $VenvDir
# Next, set the prompt from the command line, or the config file, or
# just use the name of the virtual environment folder.
if ($Prompt) {
Write-Verbose "Prompt specified as argument, using '$Prompt'"
}
else {
Write-Verbose "Prompt not specified as argument to script, checking pyvenv.cfg value"
if ($pyvenvCfg -and $pyvenvCfg['prompt']) {
Write-Verbose " Setting based on value in pyvenv.cfg='$($pyvenvCfg['prompt'])'"
$Prompt = $pyvenvCfg['prompt'];
}
else {
Write-Verbose " Setting prompt based on parent's directory's name. (Is the directory name passed to venv module when creating the virtual environment)"
Write-Verbose " Got leaf-name of $VenvDir='$(Split-Path -Path $venvDir -Leaf)'"
$Prompt = Split-Path -Path $venvDir -Leaf
}
}
Write-Verbose "Prompt = '$Prompt'"
Write-Verbose "VenvDir='$VenvDir'"
# Deactivate any currently active virtual environment, but leave the
# deactivate function in place.
deactivate -nondestructive
# Now set the environment variable VIRTUAL_ENV, used by many tools to determine
# that there is an activated venv.
$env:VIRTUAL_ENV = $VenvDir
if (-not $Env:VIRTUAL_ENV_DISABLE_PROMPT) {
Write-Verbose "Setting prompt to '$Prompt'"
# Set the prompt to include the env name
# Make sure _OLD_VIRTUAL_PROMPT is global
function global:_OLD_VIRTUAL_PROMPT { "" }
Copy-Item -Path function:prompt -Destination function:_OLD_VIRTUAL_PROMPT
New-Variable -Name _PYTHON_VENV_PROMPT_PREFIX -Description "Python virtual environment prompt prefix" -Scope Global -Option ReadOnly -Visibility Public -Value $Prompt
function global:prompt {
Write-Host -NoNewline -ForegroundColor Green "($_PYTHON_VENV_PROMPT_PREFIX) "
_OLD_VIRTUAL_PROMPT
}
$env:VIRTUAL_ENV_PROMPT = $Prompt
}
# Clear PYTHONHOME
if (Test-Path -Path Env:PYTHONHOME) {
Copy-Item -Path Env:PYTHONHOME -Destination Env:_OLD_VIRTUAL_PYTHONHOME
Remove-Item -Path Env:PYTHONHOME
}
# Add the venv to the PATH
Copy-Item -Path Env:PATH -Destination Env:_OLD_VIRTUAL_PATH
$Env:PATH = "$VenvExecDir$([System.IO.Path]::PathSeparator)$Env:PATH"

69
bin/activate Normal file

@ -0,0 +1,69 @@
# This file must be used with "source bin/activate" *from bash*
# you cannot run it directly
deactivate () {
# reset old environment variables
if [ -n "${_OLD_VIRTUAL_PATH:-}" ] ; then
PATH="${_OLD_VIRTUAL_PATH:-}"
export PATH
unset _OLD_VIRTUAL_PATH
fi
if [ -n "${_OLD_VIRTUAL_PYTHONHOME:-}" ] ; then
PYTHONHOME="${_OLD_VIRTUAL_PYTHONHOME:-}"
export PYTHONHOME
unset _OLD_VIRTUAL_PYTHONHOME
fi
# This should detect bash and zsh, which have a hash command that must
# be called to get it to forget past commands. Without forgetting
# past commands the $PATH changes we made may not be respected
if [ -n "${BASH:-}" -o -n "${ZSH_VERSION:-}" ] ; then
hash -r 2> /dev/null
fi
if [ -n "${_OLD_VIRTUAL_PS1:-}" ] ; then
PS1="${_OLD_VIRTUAL_PS1:-}"
export PS1
unset _OLD_VIRTUAL_PS1
fi
unset VIRTUAL_ENV
unset VIRTUAL_ENV_PROMPT
if [ ! "${1:-}" = "nondestructive" ] ; then
# Self destruct!
unset -f deactivate
fi
}
# unset irrelevant variables
deactivate nondestructive
VIRTUAL_ENV=/home/ubuntu/gitea-icurfer/wp-post-automation
export VIRTUAL_ENV
_OLD_VIRTUAL_PATH="$PATH"
PATH="$VIRTUAL_ENV/"bin":$PATH"
export PATH
# unset PYTHONHOME if set
# this will fail if PYTHONHOME is set to the empty string (which is bad anyway)
# could use `if (set -u; : $PYTHONHOME) ;` in bash
if [ -n "${PYTHONHOME:-}" ] ; then
_OLD_VIRTUAL_PYTHONHOME="${PYTHONHOME:-}"
unset PYTHONHOME
fi
if [ -z "${VIRTUAL_ENV_DISABLE_PROMPT:-}" ] ; then
_OLD_VIRTUAL_PS1="${PS1:-}"
PS1='(wp-post-automation) '"${PS1:-}"
export PS1
VIRTUAL_ENV_PROMPT='(wp-post-automation) '
export VIRTUAL_ENV_PROMPT
fi
# This should detect bash and zsh, which have a hash command that must
# be called to get it to forget past commands. Without forgetting
# past commands the $PATH changes we made may not be respected
if [ -n "${BASH:-}" -o -n "${ZSH_VERSION:-}" ] ; then
hash -r 2> /dev/null
fi

26
bin/activate.csh Normal file

@ -0,0 +1,26 @@
# This file must be used with "source bin/activate.csh" *from csh*.
# You cannot run it directly.
# Created by Davide Di Blasi <davidedb@gmail.com>.
# Ported to Python 3.3 venv by Andrew Svetlov <andrew.svetlov@gmail.com>
alias deactivate 'test $?_OLD_VIRTUAL_PATH != 0 && setenv PATH "$_OLD_VIRTUAL_PATH" && unset _OLD_VIRTUAL_PATH; rehash; test $?_OLD_VIRTUAL_PROMPT != 0 && set prompt="$_OLD_VIRTUAL_PROMPT" && unset _OLD_VIRTUAL_PROMPT; unsetenv VIRTUAL_ENV; unsetenv VIRTUAL_ENV_PROMPT; test "\!:*" != "nondestructive" && unalias deactivate'
# Unset irrelevant variables.
deactivate nondestructive
setenv VIRTUAL_ENV /home/ubuntu/gitea-icurfer/wp-post-automation
set _OLD_VIRTUAL_PATH="$PATH"
setenv PATH "$VIRTUAL_ENV/"bin":$PATH"
set _OLD_VIRTUAL_PROMPT="$prompt"
if (! "$?VIRTUAL_ENV_DISABLE_PROMPT") then
set prompt = '(wp-post-automation) '"$prompt"
setenv VIRTUAL_ENV_PROMPT '(wp-post-automation) '
endif
alias pydoc python -m pydoc
rehash

69
bin/activate.fish Normal file

@ -0,0 +1,69 @@
# This file must be used with "source <venv>/bin/activate.fish" *from fish*
# (https://fishshell.com/); you cannot run it directly.
function deactivate -d "Exit virtual environment and return to normal shell environment"
# reset old environment variables
if test -n "$_OLD_VIRTUAL_PATH"
set -gx PATH $_OLD_VIRTUAL_PATH
set -e _OLD_VIRTUAL_PATH
end
if test -n "$_OLD_VIRTUAL_PYTHONHOME"
set -gx PYTHONHOME $_OLD_VIRTUAL_PYTHONHOME
set -e _OLD_VIRTUAL_PYTHONHOME
end
if test -n "$_OLD_FISH_PROMPT_OVERRIDE"
set -e _OLD_FISH_PROMPT_OVERRIDE
# prevents error when using nested fish instances (Issue #93858)
if functions -q _old_fish_prompt
functions -e fish_prompt
functions -c _old_fish_prompt fish_prompt
functions -e _old_fish_prompt
end
end
set -e VIRTUAL_ENV
set -e VIRTUAL_ENV_PROMPT
if test "$argv[1]" != "nondestructive"
# Self-destruct!
functions -e deactivate
end
end
# Unset irrelevant variables.
deactivate nondestructive
set -gx VIRTUAL_ENV /home/ubuntu/gitea-icurfer/wp-post-automation
set -gx _OLD_VIRTUAL_PATH $PATH
set -gx PATH "$VIRTUAL_ENV/"bin $PATH
# Unset PYTHONHOME if set.
if set -q PYTHONHOME
set -gx _OLD_VIRTUAL_PYTHONHOME $PYTHONHOME
set -e PYTHONHOME
end
if test -z "$VIRTUAL_ENV_DISABLE_PROMPT"
# fish uses a function instead of an env var to generate the prompt.
# Save the current fish_prompt function as the function _old_fish_prompt.
functions -c fish_prompt _old_fish_prompt
# With the original prompt function renamed, we can override with our own.
function fish_prompt
# Save the return status of the last command.
set -l old_status $status
# Output the venv prompt; color taken from the blue of the Python logo.
printf "%s%s%s" (set_color 4B8BBE) '(wp-post-automation) ' (set_color normal)
# Restore the return status of the previous command.
echo "exit $old_status" | .
# Output the original/"old" prompt.
_old_fish_prompt
end
set -gx _OLD_FISH_PROMPT_OVERRIDE "$VIRTUAL_ENV"
set -gx VIRTUAL_ENV_PROMPT '(wp-post-automation) '
end

8
bin/pip Executable file

@ -0,0 +1,8 @@
#!/home/ubuntu/gitea-icurfer/wp-post-automation/bin/python3
# -*- coding: utf-8 -*-
import re
import sys
from pip._internal.cli.main import main
if __name__ == '__main__':
sys.argv[0] = re.sub(r'(-script\.pyw|\.exe)?$', '', sys.argv[0])
sys.exit(main())

8
bin/pip3 Executable file

@ -0,0 +1,8 @@
#!/home/ubuntu/gitea-icurfer/wp-post-automation/bin/python3
# -*- coding: utf-8 -*-
import re
import sys
from pip._internal.cli.main import main
if __name__ == '__main__':
sys.argv[0] = re.sub(r'(-script\.pyw|\.exe)?$', '', sys.argv[0])
sys.exit(main())

8
bin/pip3.10 Executable file

@ -0,0 +1,8 @@
#!/home/ubuntu/gitea-icurfer/wp-post-automation/bin/python3
# -*- coding: utf-8 -*-
import re
import sys
from pip._internal.cli.main import main
if __name__ == '__main__':
sys.argv[0] = re.sub(r'(-script\.pyw|\.exe)?$', '', sys.argv[0])
sys.exit(main())

1
bin/python Symbolic link

@ -0,0 +1 @@
python3

1
bin/python3 Symbolic link

@ -0,0 +1 @@
/usr/bin/python3

1
bin/python3.10 Symbolic link

@ -0,0 +1 @@
python3

79
dev.py Normal file

@ -0,0 +1,79 @@
import requests
from bs4 import BeautifulSoup
from markdownify import markdownify as md
from package import GetConfig, MariaDB, ChangeTextToPost, WordPress
import markdown


# There is currently no DB integration; a URL must be entered at run time.
def get_naver_blog_content_as_markdown(url):
    # Switch to the mobile version of the Naver blog
    mobile_url = url.replace("blog.naver.com", "m.blog.naver.com")
    # Browser spoofing ------------------------------------------------
    # Disabled because the request works fine without it.
    # headers = {
    #     "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/91.0.4472.124 Safari/537.36"
    # }
    # response = requests.get(mobile_url, headers=headers)
    # ---------------------------------------------------------------
    response = requests.get(mobile_url)
    if response.status_code != 200:
        print(f"Failed to fetch the page: {response.status_code}")
        return None
    # Parse the HTML with BeautifulSoup
    soup = BeautifulSoup(response.text, 'html.parser')
    # Extract the body (using the body class of the mobile version)
    content = soup.find("div", class_="se-main-container")
    if content:
        html_content = str(content)
        markdown_content = md(html_content)  # HTML → Markdown conversion
        # Remove blank lines
        markdown_content = "\n".join([line for line in markdown_content.splitlines() if line.strip()])
        return markdown_content
    else:
        print("Failed to extract the blog content.")
        return None


# 2024-10-03 Load environment variables
print('### Get values From .env')
config = GetConfig()
dict_data = config.get_config_as_dict()

# URL supplied by the user
url = input("Enter your blog address : ")
# markdown_content = get_naver_blog_content_as_markdown(url)
post_article = get_naver_blog_content_as_markdown(url)
if post_article is None:
    # The fetch or extraction failed; stop before calling str methods on None.
    raise SystemExit("Could not fetch the blog content.")
post_article = post_article.replace(">", "###")
# if markdown_content:
#     print("Markdown Content:")
#     print(markdown_content)
#     # Save as a Markdown file
#     with open("blog_content.md", "w", encoding="utf-8") as file:
#         file.write(markdown_content)
#     print("Blog content saved as blog_content.md")

# print('### Convert to HTML - markdown to html')
# # 2024-10-03 Convert Markdown to HTML
# html = markdown.markdown(post_article)
# # 2024-10-03 Register the WordPress post as a draft
# print('### Create post')
# wp = WordPress(dict_data)
# rs = wp.create_post(2,html)
# if __name__ == "__main__":
#     # print(post_article)
#     print("Extra output for verification")
#     if rs.ok:
#         print(f"### Success code:{rs.status_code}")
#     else:
#         print(f"### Failure code:{rs.status_code} reason:{rs.reason} msg:{rs.text}")

View File

@ -1,45 +0,0 @@
import mysql.connector
from dotenv import load_dotenv
import os

# Load the .env.dev file
load_dotenv(r'./.env.dev')

# Read the environment variables
host = os.getenv('DB_HOST')
user = os.getenv('DB_USER')
password = os.getenv('DB_PASSWORD')
database = os.getenv('DB_NAME')


# Function that connects to MariaDB
def fetch_data_from_mariadb():
    try:
        # Connect to the database
        connection = mysql.connector.connect(
            host=host,
            user=user,
            password=password,
            database=database
        )
        # Create a cursor
        cursor = connection.cursor(dictionary=True)
        # Run the query
        query = "SELECT * FROM healty_url_source ORDER BY idx DESC LIMIT 1;"
        cursor.execute(query)
        # Fetch the result
        result = cursor.fetchone()
        return result
    except mysql.connector.Error as err:
        print(f"Error: {err}")
    finally:
        if connection.is_connected():
            cursor.close()
            connection.close()


if __name__ == "__main__":
    # Check the result
    data = fetch_data_from_mariadb()
    print(data['url'])

16
k8s-manifests/env.yaml Normal file

@ -0,0 +1,16 @@
apiVersion: v1
kind: Secret
metadata:
  name: wp-secret
  namespace: default
type: Opaque
data:
  DB_HOST:
  DB_USER:
  DB_PASSWORD:
  DB_NAME:
  OPENAI_API_KEY:
  WP_URL:
  WP_USER:
  WP_API_KEY:
  WP_POST_STYLE: ""
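Review note: values under `data:` in a Kubernetes Secret must be base64-encoded (plain strings belong under `stringData:` instead). A minimal sketch for producing the encoded values to paste into the template above; `encode_secret_values` is a hypothetical helper and the inputs are the placeholder values from sample.env.dev, not real credentials.

```python
import base64


def encode_secret_values(values):
    """Base64-encode each value for use under a Secret's `data:` key (hypothetical helper)."""
    return {
        key: base64.b64encode(value.encode("utf-8")).decode("ascii")
        for key, value in values.items()
    }


# Placeholder inputs only.
encoded = encode_secret_values({"DB_USER": "demo", "WP_POST_STYLE": ""})
for key, value in encoded.items():
    print(f'  {key}: "{value}"')
```

An empty string encodes to an empty string, which matches the `WP_POST_STYLE: ""` entry in the template.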

61
k8s-manifests/pod.yaml Normal file

@ -0,0 +1,61 @@
apiVersion: v1
kind: Pod
metadata:
  name: wp-auto-pod
  namespace: default
  annotations:
    sidecar.istio.io/inject: "false"
spec:
  containers:
    - name: wp-auto-container
      image: harbor.icurfer.com/py_prj/wp-auto:0.0.1
      imagePullPolicy: IfNotPresent
      env:
        - name: DB_HOST
          valueFrom:
            secretKeyRef:
              name: wp-secret
              key: DB_HOST
        - name: DB_USER
          valueFrom:
            secretKeyRef:
              name: wp-secret
              key: DB_USER
        - name: DB_PASSWORD
          valueFrom:
            secretKeyRef:
              name: wp-secret
              key: DB_PASSWORD
        - name: DB_NAME
          valueFrom:
            secretKeyRef:
              name: wp-secret
              key: DB_NAME
        - name: OPENAI_API_KEY
          valueFrom:
            secretKeyRef:
              name: wp-secret
              key: OPENAI_API_KEY
        - name: WP_URL
          valueFrom:
            secretKeyRef:
              name: wp-secret
              key: WP_URL
        - name: WP_USER
          valueFrom:
            secretKeyRef:
              name: wp-secret
              key: WP_USER
        - name: WP_API_KEY
          valueFrom:
            secretKeyRef:
              name: wp-secret
              key: WP_API_KEY
        - name: WP_POST_STYLE
          valueFrom:
            secretKeyRef:
              name: wp-secret
              key: WP_POST_STYLE
  restartPolicy: Never
  imagePullSecrets:
    - name: harbor-icurfer-private

1
lib64 Symbolic link

@ -0,0 +1 @@
lib

52
main.py Normal file

@ -0,0 +1,52 @@
import package as pkg
from package import GetConfig, MariaDB, ChangeTextToPost, WordPress
import markdown

# 2024-10-03 Load environment variables
print('### Get values From .env')
config = GetConfig()
dict_data = config.get_config_as_dict()

# 2024-10-03 Fetch the URL from the DB
# It also works without the DB if a URL is supplied directly. - 2025.01.19
print('### Get URL From DB')
db = MariaDB(dict_data)
url = db.fetch_data_from_mariadb()['url']  # Look up the most recent entry. - 2025.01.19
print(url)

# 2024-10-03 Extract the text from the URL
print('### Get content From URL')
origin_content = pkg.getContents(url)
print(origin_content)

# 2024-10-03 Restyle the post with OpenAI
print('### Convert to Post - openAI')
openai_key = dict_data['openai_api_key']
print(f"### OpenAI Key : {openai_key}")
wp_reference_style = dict_data['wp_post_style']
print(f"### WP Reference Style : {wp_reference_style}")
open_ai = ChangeTextToPost(openai_key)
post_article = open_ai.generate_blog_post(origin_content, wp_reference_style)
print('### DEBUG ###')
print(post_article)

# print('### Convert to HTML - markdown to html')
# # 2024-10-03 Convert Markdown to HTML
# html = markdown.markdown(post_article)
# # 2024-10-03 Register the WordPress post as a draft
# print('### Create post')
# wp = WordPress(dict_data)
# rs = wp.create_post(2,html)
rs = None  # Stays None while the WordPress posting step above is commented out

if __name__ == "__main__":
    print(post_article)
    print("Extra output for verification")
    if rs is None:
        # Without this guard, rs.ok raises NameError because rs is never assigned.
        print("### Skipped: WordPress posting is disabled")
    elif rs.ok:
        print(f"### Success code:{rs.status_code}")
    else:
        print(f"### Failure code:{rs.status_code} reason:{rs.reason} msg:{rs.text}")
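Review note: main.py prints the full OpenAI key, which would end up in the pod logs when the container runs on Kubernetes. A minimal sketch of masking the value before logging; `mask_secret` is a hypothetical helper, and keeping four visible characters is an arbitrary assumption.

```python
def mask_secret(value, visible=4):
    """Return a log-safe form of a secret (hypothetical helper).

    Keeps the first `visible` characters and replaces the rest with '*'.
    """
    if not value:
        return "<unset>"
    if len(value) <= visible:
        return "*" * len(value)
    return value[:visible] + "*" * (len(value) - visible)


print(f"### OpenAI Key : {mask_secret('sk-abcdef123')}")
```

The `<unset>` branch also makes a missing `OPENAI_API_KEY` visible in the log without raising.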

81
main_naver_blog_html.py Normal file

@ -0,0 +1,81 @@
import requests
from bs4 import BeautifulSoup
from markdownify import markdownify as md
from package import GetConfig, MariaDB, ChangeTextToPost, WordPress
import markdown
import re


# There is currently no DB integration; a URL must be entered at run time.
def get_naver_blog_content_as_markdown(url):
    # Switch to the mobile version of the Naver blog
    mobile_url = url.replace("blog.naver.com", "m.blog.naver.com")
    # Browser spoofing ------------------------------------------------
    # Disabled because the request works fine without it.
    # headers = {
    #     "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/91.0.4472.124 Safari/537.36"
    # }
    # response = requests.get(mobile_url, headers=headers)
    # ---------------------------------------------------------------
    response = requests.get(mobile_url)
    if response.status_code != 200:
        print(f"Failed to fetch the page: {response.status_code}")
        return None
    # Parse the HTML with BeautifulSoup
    soup = BeautifulSoup(response.text, 'html.parser')
    # Extract the body (using the body class of the mobile version)
    content = soup.find("div", class_="se-main-container")
    if content:
        html_content = str(content)
        markdown_content = md(html_content)  # HTML → Markdown conversion
        # Remove blank lines
        markdown_content = "\n".join([line for line in markdown_content.splitlines() if line.strip()])
        return markdown_content
    else:
        print("Failed to extract the blog content.")
        return None


# 2024-10-03 Load environment variables
print('### Get values From .env')
config = GetConfig()
dict_data = config.get_config_as_dict()

# URL supplied by the user
url = input("Enter your blog address : ")
# markdown_content = get_naver_blog_content_as_markdown(url)
post_article = get_naver_blog_content_as_markdown(url)
if post_article is None:
    # The fetch or extraction failed; stop before calling str methods on None.
    raise SystemExit("Could not fetch the blog content.")
post_article = post_article.replace(">", "###")
# Replace linked-image markup at the start of a line with a "#### 이미지" ("image") heading
post_article = re.sub(r"^\[!\[\].*?\]", "#### 이미지", post_article, flags=re.MULTILINE)
# if markdown_content:
#     print("Markdown Content:")
#     print(markdown_content)
#     # Save as a Markdown file
#     with open("blog_content.md", "w", encoding="utf-8") as file:
#         file.write(markdown_content)
#     print("Blog content saved as blog_content.md")

print('### Convert to HTML - markdown to html')
# 2024-10-03 Convert Markdown to HTML
html = markdown.markdown(post_article)
# 2024-10-03 Register the WordPress post as a draft
print('### Create post')
wp = WordPress(dict_data)
rs = wp.create_post(2,html)

if __name__ == "__main__":
    # print(post_article)
    print("Extra output for verification")
    if rs.ok:
        print(f"### Success code:{rs.status_code}")
    else:
        print(f"### Failure code:{rs.status_code} reason:{rs.reason} msg:{rs.text}")
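Review note: `post_article.replace(">", "###")` rewrites every `>` in the post, including ones inside ordinary sentences. If the intent is to turn Markdown blockquote markers into headings, a line-anchored substitution is one possible alternative. This is a sketch under that assumption; `blockquotes_to_headings` is a hypothetical name and the intent is inferred, not stated in the change.

```python
import re


def blockquotes_to_headings(markdown_text):
    """Turn a leading '>' blockquote marker into '### ' (hypothetical helper).

    Only a '>' at the very start of a line is rewritten; any other '>' is kept.
    """
    return re.sub(r"^>[ \t]?", "### ", markdown_text, flags=re.MULTILINE)


text = "> Quote line\nValue a > b stays"
print(blockquotes_to_headings(text))
```

The character class `[ \t]?` is used instead of `\s?` so that a bare `>` line does not swallow its trailing newline.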

View File

@ -1,50 +0,0 @@
import os
from openai import OpenAI
from dotenv import load_dotenv
import translate_article as ta

# Load the API key from the .env file
load_dotenv()
client = OpenAI(
    api_key=os.getenv("OPENAI_API_KEY"),
)


def generate_blog_post(article_text, style_reference):
    # Transform the text with the Chat Completions API
    # Prompt (kept in Korean): "You are a blog expert living in South Korea.
    # Your posts have achieved high visibility, engagement, and conversion
    # over the past three years. Use this expertise to rewrite the supplied
    # article as a blog post. Imitate the style of the document below;
    # appropriate examples are welcome. Supplied article content: ..."
    prompt = (
        """
        너는 대한민국에 거주하는 블로그 전문가이다.
        네가 작성한 블로그 글은 지난 3년간 높은 주목성, 관여도, 전환율을 만들었다.
        이 전문성을 이용해서 제공받는 기사를 블로그 형태로 변형하여 작성해야만 한다.
        """
        f"\n블로그 스타일은 아래 문서를 모방해줘. 적절한 사례들이 들어가도 좋겠어.\n---\n{style_reference}\n"
        f"제공된 기사 내용:\n{article_text}"
    )
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[
            {"role": "system", "content": "너는 대한민국에 거주하는 블로그 전문가이다."},
            {"role": "user", "content": prompt}
        ],
        max_tokens=2000,
        temperature=0.7,  # Tuned for a reasonable level of creativity
    )
    # Extract the text from the response
    blog_post = response.choices[0].message.content
    return blog_post


# Example article text (the text produced by the parsing step can be used)
article_text = ta.getContents()
# Blog style reference text
style_reference = os.getenv("reference_style")
# Generate the blog post
blog_post = generate_blog_post(article_text, style_reference)
# Print the result
print(">>>>\n" * 3)
print(blog_post)

49
package/GetConfig.py Normal file

@ -0,0 +1,49 @@
import os
from dotenv import load_dotenv

# Priority: .env.prd > .env.dev > .env
if os.path.exists('.env.prd'):
    print("Read ::: .env.prd")
    load_dotenv('.env.prd')
elif os.path.exists('.env.dev'):
    print("Read ::: .env.dev")
    load_dotenv('.env.dev')
else:
    print("Read ::: .env")
    load_dotenv('.env')  # Default .env file


class GetConfig:
    def __init__(self):
        self.db_host = os.getenv('DB_HOST')
        self.db_user = os.getenv('DB_USER')
        self.db_pw = os.getenv('DB_PASSWORD')
        self.db_database = os.getenv('DB_NAME')
        self.openai_api_key = os.getenv('OPENAI_API_KEY')
        self.wp_url = os.getenv('WP_URL')
        self.wp_user = os.getenv('WP_USER')
        self.wp_api_key = os.getenv('WP_API_KEY')
        self.wp_post_style = os.getenv('WP_POST_STYLE')

    def show_config(self):
        for key, value in self.__dict__.items():
            print(f"{key.upper()}: {value}")

    def get_config_as_dict(self):
        # Return the instance attributes as a dictionary
        return self.__dict__


if __name__ == "__main__":
    # Check the result
    config = GetConfig()
    config.show_config()

# Written for reference because this was confusing after a long time away
# class GetConfig:
#     def __init__(self, name=None):
#         self.name = name if name is not None else "default_name"
#         self.host = os.getenv('DB_HOST')
#         self.user = os.getenv('DB_USER')
#         self.password = os.getenv('DB_PASSWORD')
#         self.database = os.getenv('DB_NAME')
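Review note: the `.env.prd > .env.dev > .env` priority in GetConfig.py runs at import time, which makes it hard to exercise in isolation. A minimal sketch of the same priority as a function; `select_env_file` and `ENV_PRIORITY` are hypothetical names. Unlike the module above, this sketch returns `None` when no file exists instead of falling back to `.env` unconditionally.

```python
import tempfile
from pathlib import Path

# Highest priority first, mirroring the order used in GetConfig.py.
ENV_PRIORITY = (".env.prd", ".env.dev", ".env")


def select_env_file(directory="."):
    """Return the first existing env file by priority, or None (hypothetical helper)."""
    base = Path(directory)
    for name in ENV_PRIORITY:
        candidate = base / name
        if candidate.is_file():
            return candidate
    return None


# Demo in a throwaway directory: .env.dev wins over .env when both exist.
with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, ".env").write_text("A=1\n")
    Path(tmp, ".env.dev").write_text("A=2\n")
    chosen = select_env_file(tmp).name
print(chosen)
```

The returned path could then be handed to `load_dotenv`, which is the call the module already uses.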

56
package/MariaDB.py Normal file

@ -0,0 +1,56 @@
import mysql.connector


class MariaDB:
    def __init__(self, config):
        self.db_host = config['db_host']
        self.db_user = config['db_user']
        self.db_pw = config['db_pw']
        self.db_database = config['db_database']

    def show_config(self):
        for key, value in self.__dict__.items():
            print(f"{key.upper()}: {value}")

    def fetch_data_from_mariadb(self):
        # Initialised up front so the finally block is safe if connect() fails.
        connection = None
        cursor = None
        try:
            # Connect to the database
            connection = mysql.connector.connect(
                host=self.db_host,
                user=self.db_user,
                password=self.db_pw,
                database=self.db_database
            )
            # Create a cursor
            cursor = connection.cursor(dictionary=True)
            # Run the query
            query = "SELECT * FROM healty_url_source ORDER BY idx DESC LIMIT 1;"
            cursor.execute(query)
            # Fetch the result
            result = cursor.fetchone()
            return result
        except mysql.connector.Error as err:
            print(f"Error: {err}")
            return None
        finally:
            # Close only what was actually opened
            if cursor is not None:
                cursor.close()
            if connection is not None and connection.is_connected():
                connection.close()


if __name__ == "__main__":
    import GetConfig
    config = GetConfig.GetConfig()
    # Check that the config loads correctly
    config_dict = config.get_config_as_dict()
    # MariaDB test
    dbg = MariaDB(config_dict)
    # dbg.show_config()
    url = dbg.fetch_data_from_mariadb()
    print(url['url'])
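Review note: `fetch_data_from_mariadb()` can return `None` (empty table or a connection error), and callers such as main.py index the result with `['url']` directly. A minimal sketch of the "latest row or None" lookup with explicit cleanup follows. It uses the standard-library `sqlite3` as a stand-in for `mysql.connector` so that it runs without a database server; `fetch_latest_url` is a hypothetical helper and the example rows are made up.

```python
import sqlite3
from contextlib import closing


def fetch_latest_url(connection):
    """Return the url of the row with the highest idx, or None if the table is empty."""
    with closing(connection.cursor()) as cursor:
        cursor.execute("SELECT url FROM healty_url_source ORDER BY idx DESC LIMIT 1")
        row = cursor.fetchone()
    return row[0] if row else None


# In-memory stand-in database with the same table name as the project query.
with closing(sqlite3.connect(":memory:")) as conn:
    conn.execute("CREATE TABLE healty_url_source (idx INTEGER PRIMARY KEY, url TEXT)")
    empty_result = fetch_latest_url(conn)
    conn.executemany(
        "INSERT INTO healty_url_source (url) VALUES (?)",
        [("https://example.com/a",), ("https://example.com/b",)],
    )
    latest = fetch_latest_url(conn)

print(empty_result, latest)
```

A caller can then check for `None` before continuing with the crawl step.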

37
package/OpenAI.py Normal file

@ -0,0 +1,37 @@
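from openai import OpenAI


class ChangeTextToPost:
    def __init__(self, key):
        self.client = OpenAI(
            api_key=key,
        )

    def generate_blog_post(self, origin_content, wp_post_style):
        # Transform the text with the Chat Completions API
        # Prompt (kept in Korean): "You are a blog expert living in South Korea.
        # Your posts have achieved high visibility, engagement, and conversion
        # over the past three years. Use this expertise to rewrite the supplied
        # article as a blog post. Write the post, then create a title and add it
        # on the very last line. Imitate the style of the document below; it
        # must be written in Markdown. Supplied article content: ..."
        prompt = (
            """
            너는 대한민국에 거주하는 블로그 전문가이다.
            네가 작성한 블로그 글은 지난 3년간 높은 주목성, 관여도, 전환율을 만들었다.
            이 전문성을 이용해서 제공받는 기사를 블로그 형태로 변형하여 작성해야만 한다.
            ---
            글을 작성하고 제목을 만들어서 맨 마지막 줄에 추가해줘.
            """
            f"\n블로그 스타일은 아래 문서를 모방해줘. markdown형태로 작성되어야한다. 적절한 사례들이 들어가도 좋겠어.\n---\n{wp_post_style}\n"
            f"제공된 기사 내용:\n{origin_content}"
        )
        response = self.client.chat.completions.create(
            model="gpt-4o",
            messages=[
                {"role": "system", "content": "너는 대한민국에 거주하는 블로그 전문가이다."},
                {"role": "user", "content": prompt}
            ],
            max_tokens=2000,
            temperature=0.7,  # Tuned for a reasonable level of creativity
        )
        # Extract the text from the response
        blog_post = response.choices[0].message.content
        return blog_post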
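Review note: the prompt asks the model to put a generated title on the very last line, but nothing in this change reads that line back; `create_post` falls back to its default title. A minimal sketch of splitting the title from the body, assuming the model follows the instruction literally; `split_title` is a hypothetical helper, and real responses may need extra handling (for example a "Title:" style prefix).

```python
def split_title(post_text, default_title="Untitled"):
    """Split a generated post into (title, body), taking the last line as the title.

    Hypothetical helper: assumes the title sits alone on the final non-empty line.
    """
    text = post_text.rstrip()
    body, separator, last_line = text.rpartition("\n")
    if not separator:
        # Single-line response: nothing to split off.
        return default_title, text
    # Drop Markdown heading markers from the title line.
    title = last_line.strip().lstrip("#").strip()
    return title or default_title, body.rstrip()


title, body = split_title("# Intro\n\nBody paragraph.\n\n## My Title\n")
print(title)
print(body)
```

The resulting title could be passed as the `title` argument that `WordPress.create_post` already accepts.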

69
package/Utility.py Normal file

@ -0,0 +1,69 @@
import json
from datetime import datetime
from urllib.parse import urljoin

import requests
from bs4 import BeautifulSoup


def getContents(url):
    # Add a User-Agent header (to avoid 403 responses)
    headers = {
        'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) '
                      'AppleWebKit/537.36 (KHTML, like Gecko) '
                      'Chrome/113.0.0.0 Safari/537.36'
    }
    # Fetch the page with an HTTP GET request
    response = requests.get(url, headers=headers)
    # Check the response status
    if response.status_code == 200:
        # Parse the HTML
        soup = BeautifulSoup(response.text, 'html.parser')
        # Strip the HTML tags and get all text on the page (full content)
        page_content = soup.get_text()
        print("### url DEBUG ###")
        print(page_content)
        # Drop blank lines and keep only text (filtering on line breaks)
        lines = [line.strip() for line in page_content.splitlines() if line.strip()]
        # Build the result
        contents = "\n".join(lines)
        return contents
    else:
        print(f"Failed to fetch the URL. Status code: {response.status_code}")
        return None


class WordPress:
    def __init__(self, config):
        self.wp_url = config['wp_url']
        self.wp_user = config['wp_user']
        self.wp_api_key = config['wp_api_key']

    # The default title "파이썬 자동 포스팅" means "Python automatic posting".
    def create_post(self, category_id, content, media_id=None, status="draft", title="파이썬 자동 포스팅"):
        payload = {
            "status": status,  # publish / draft
            "title": title,
            "content": content,
            "date": datetime.now().isoformat(),  # YYYY-MM-DDTHH:MM:SS
            "categories": category_id
        }
        if media_id is not None:
            payload['featured_media'] = media_id
        return requests.post(urljoin(self.wp_url, "wp-json/wp/v2/posts"),
                             data=json.dumps(payload),
                             headers={'Content-type': "application/json"},
                             auth=(self.wp_user, self.wp_api_key))
        # if result.ok:
        #     print(f"Success code:{result.status_code}")
        # else:
        #     print(f"Failure code:{result.status_code} reason:{result.reason} msg:{result.text}")


if __name__ == "__main__":
    # url = 'https://www.hani.co.kr/arti/science/science_general/1161001.html'
    # tmp = getContents(url)
    # print(tmp)
    pass
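Review note: `urljoin(self.wp_url, "wp-json/wp/v2/posts")` works for a bare host such as the `WP_URL` in sample.env.dev, but `urljoin` replaces the last path segment when the base has a path without a trailing slash. A minimal sketch of a more forgiving endpoint builder; `posts_endpoint` is a hypothetical helper and the subdirectory install is an assumed scenario.

```python
from urllib.parse import urljoin


def posts_endpoint(wp_url):
    """Build the WordPress REST posts endpoint from a site URL (hypothetical helper).

    A trailing slash is added first so urljoin appends rather than replaces.
    """
    base = wp_url if wp_url.endswith("/") else wp_url + "/"
    return urljoin(base, "wp-json/wp/v2/posts")


# Plain urljoin drops the "/blog" segment of a subdirectory install:
print(urljoin("https://www.example.com/blog", "wp-json/wp/v2/posts"))
# The helper keeps it:
print(posts_endpoint("https://www.example.com/blog"))
```

For a site served from the domain root the two forms give the same URL.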

5
package/__init__.py Normal file

@ -0,0 +1,5 @@
from .Utility import getContents
from .Utility import WordPress
from .GetConfig import GetConfig
from .MariaDB import MariaDB
from .OpenAI import ChangeTextToPost

3
pyvenv.cfg Normal file

@ -0,0 +1,3 @@
home = /usr/bin
include-system-site-packages = false
version = 3.10.12

Binary file not shown.

9
sample.env.dev Normal file

@ -0,0 +1,9 @@
DB_HOST=192.168.0.1
DB_USER=demo
DB_PASSWORD=demo
DB_NAME=demo
OPENAI_API_KEY=demo
WP_URL='https://www.example.com'
WP_USER='demo'
WP_API_KEY='demo'
WP_POST_STYLE="문장" # Style reference text passed to OpenAI ("문장" means "sentence").

2
tempCodeRunnerFile.py Normal file

@ -0,0 +1,2 @@
if rs.ok:

View File

@ -1,30 +0,0 @@
import requests
from bs4 import BeautifulSoup
import get_url

url = get_url.fetch_data_from_mariadb()['url']


def getContents():
    # Fetch the page with an HTTP GET request
    response = requests.get(url)
    # Check the response status
    if response.status_code == 200:
        # Parse the HTML
        soup = BeautifulSoup(response.text, 'html.parser')
        # Strip the HTML tags and get all text on the page (full content)
        page_content = soup.get_text()
        # Drop blank lines and keep only text (filtering on line breaks)
        lines = [line.strip() for line in page_content.splitlines() if line.strip()]
        # Build the result
        contents = "\n".join(lines)
        return contents
    else:
        print(f"Failed to fetch the URL. Status code: {response.status_code}")


tmp = getContents()
print(tmp)

1
version Normal file

@ -0,0 +1 @@
0.1.2